Abstract: In this paper we consider the problem of full-duplex 
multiple-input multiple-output (MIMO) relaying between multi- 
antenna source and destination nodes. The principal difficulty in 
implementing such a system is that, due to the limited attenuation 
between the relay's transmit and receive antenna arrays, the 
relay's outgoing signal may overwhelm its limited-dynamic-range 
input circuitry, making it difficult — if not impossible — to recover 
the desired incoming signal. While explicitly modeling transmit- 
ter/receiver dynamic-range limitations and channel estimation 
error, we derive tight upper and lower bounds on the end- 
to-end achievable rate of decode-and-forward-based full-duplex 
MIMO relay systems, and propose a transmission scheme based 
on maximization of the lower bound. The maximization requires 
us to (numerically) solve a nonconvex optimization problem, for 
which we detail a novel approach based on bisection search and 
gradient projection. To gain insights into system design tradeoffs, 
we also derive an analytic approximation to the achievable rate 
and numerically demonstrate its accuracy. We then study the 
behavior of the achievable rate as a function of signal-to-noise 
ratio, interference-to-noise ratio, transmitter/receiver dynamic 
range, number of antennas, and training length, using optimized 
half-duplex signaling as a baseline. 

Keywords: MIMO relays, full-duplex relays, limited dynamic 
range, channel estimation. 



I. Introduction 

We consider the problem of communicating from a source 
node to a destination node through a relay node. Traditional 
relay systems operate in a half-duplex mode, whereby the 
time-frequency signal-space used for the source-to-relay link 
is kept orthogonal to that used for the relay-to-destination link, 
such as with non-overlapping time periods or frequency bands. 
Half-duplex operation is used to avoid the high levels of relay 
self-interference that are faced with full-duplex operation (see 
Fig. 1), where the source and relay share a common time- 
frequency signal-space. For example, it is not unusual for the 
ratio between the relay's self-interference power and desired 
incoming signal power to exceed 100 dB [3], or, in general, 
some value larger than the dynamic range of the relay's front- 
end hardware, making it impossible to recover the desired 
signal. The importance of limited dynamic-range (DR) cannot 
be overstressed; notice that, even if the self-interference signal 
was perfectly known, limited-DR renders perfect cancellation 
impossible. 






Fig. 1. Full-duplex MIMO relaying from source to destination. Solid lines 
denote desired propagation and dashed lines denote interference. 

Recently, multiple-input multiple-output (MIMO) relaying 
has been proposed as a means of increasing spectral effi- 
ciency (e.g., [4], [5]). By MIMO relaying, we mean that 
the source, relay, and destination each use multiple antennas 
for both reception and transmission. MIMO relaying brings 
the possibility of full-duplex operation through spatial self- 
interference suppression (e.g., [3], [6]-[15]). As a simple 
example, one can imagine using the relay's transmit array to 
form spatial nulls at a subset of the relay's receive antennas, 
which are then free of self-interference and able to recover 
the desired signal. In forming these nulls, however, it can be 
seen that the relay consumes spatial degrees-of-freedom that 
could have been used in communicating data to the destination. 
Thus, maximizing the end-to-end throughput involves navigat- 
ing a tradeoff between the source-to-relay link and relay-to- 
destination link. Of course, maximizing end-to-end throughput 
is more involved than simply protecting an arbitrary subset 
of the relay's receive antennas; one also needs to consider 
which subset to protect, and the degree to which each of 
those antennas is protected, given the source-to-relay and 
relay-to-destination MIMO channel coefficients, the estimation 
errors on those coefficients, and the DR limitations of the 
various nodes. These considerations motivate the following 
fundamental questions about full-duplex MIMO relaying in 
the presence of self-interference: 1) What is the maximum 
achievable end-to-end throughput under a transmit power 
constraint? 2) How can the system be designed to achieve 
this throughput? 

In this paper, we aim to answer these two fundamental 
questions while paying special attention to the effects of both 
limited-DR and channel estimation error:

1) Limited-DR is a natural consequence of non-ideal amplifiers,
oscillators, analog-to-digital converters (ADCs), and
digital-to-analog converters (DACs). To model the effects of
limited receiver-DR, we inject, at each receive antenna, an
additive white Gaussian "receiver distortion" with variance
$\beta$ times the energy impinging on that receive antenna
(where $\beta \ll 1$). Similarly, to model the effects of limited
transmitter-DR, we inject, at each transmit antenna, an additive
white Gaussian "transmitter noise" with variance $\kappa$ times
the energy of the intended transmit signal (where
$\kappa \ll 1$). Thus, $\kappa^{-1}$ and $\beta^{-1}$
characterize the transmitter and receiver dynamic ranges,
respectively.
2) Imperfect CSI can result for several reasons, including 
channel time-variation, additive noise, and DR limita- 
tions. We focus on CSI imperfections that result from the 
use of pilot-aided least-squares (LS) channel estimation 
performed in the presence of limited-DR. 
Moreover, we consider regenerative relays that decode-and-forward
(as in [3], [6]-[10]), as opposed to simpler non-regenerative
relays that only amplify-and-forward (as in [11]-[15]).

The contributions of this paper are as follows. For the 
full-duplex MIMO relaying problem, an explicit model for 
transmitter/receiver-DR limitations is proposed; pilot-aided 
least-squares MIMO-channel estimation, under DR limita- 
tions, is analyzed; the residual self-interference, from DR 
limitations and channel-estimation error, is analyzed; lower 
and upper bounds on the achievable rate are derived; a 
transmission scheme is proposed based on maximizing the 
achievable-rate lower bound subject to a power constraint, 
requiring the solution of a nonconvex optimization problem, 
to which we apply bisection search and Gradient Projection; 
an analytic approximation of the maximum achievable rate is 
proposed; and, the achievable rate is numerically investigated 
as a function of signal-to-noise ratio, interference-to-noise 
ratio, transmitter/receiver dynamic range, number of antennas, 
and number of pilots. 

The paper is structured as follows. In Section II, we state
our channel model, limited-DR model, and assumptions on the
transmission protocol. Then, in Section III, we derive upper
and lower bounds on the achievable rate under pilot-aided
channel estimation and partial self-interference cancellation
at the relay. In Section IV, we propose a novel transmission
scheme that is based on maximizing the achievable-rate lower
bound subject to a power constraint and, in Section V, we
derive a closed-form approximation of the optimized achievable
rate whose accuracy is numerically verified. Then, in
Section VI, we numerically investigate achievable rate as a
function of the SNRs $(\rho_r, \rho_d)$, the INRs
$(\eta_r, \eta_d)$, the dynamic range parameters
$(\kappa, \beta)$, the number of antennas
$(N_s, N_r, M_r, M_d)$, and the training length $T$, and we also
investigate the gain of full-duplex signaling (over half-duplex)
and partial self-interference cancellation. Finally, in
Section VII, we conclude.

Notation: We use $(\cdot)^T$ to denote transpose, $(\cdot)^*$
conjugate, and $(\cdot)^H$ conjugate transpose. For matrices
$A, B \in \mathbb{C}^{M \times N}$, we use $\mathrm{tr}(A)$ to
denote trace, $\det(A)$ to denote determinant, $A \odot B$ to
denote elementwise (i.e., Hadamard) product,
$\mathrm{sum}(A) \in \mathbb{C}$ to denote the sum over all
elements, $\mathrm{vec}(A) \in \mathbb{C}^{MN}$ to denote
vectorization, $\mathrm{diag}(A)$ to denote the diagonal matrix
with the same diagonal elements as $A$, $\mathrm{Diag}(a)$ to
denote the diagonal matrix whose diagonal is constructed from
the vector $a$, and $[A]_{m,n}$ to denote the element in the
$m$-th row and $n$-th column of $A$. We denote expectation by
$\mathrm{E}\{\cdot\}$, covariance by $\mathrm{Cov}\{\cdot\}$,
statistical independence by $\perp\!\!\!\perp$, the circular
complex Gaussian pdf with mean vector $m$ and covariance matrix
$Q$ by $\mathcal{CN}(m, Q)$, and the Kronecker delta sequence by
$\delta_k$. Finally, $I$ denotes the identity matrix,
$\mathbb{C}$ the complex field, and $\mathbb{Z}^+$ the positive
integers.

II. System Model 

We will use $N_s$ and $N_r$ to denote the number of transmit
antennas at the source and relay, respectively, and $M_r$ and
$M_d$ to denote the number of receive antennas at the relay
and destination, respectively. Here and in the sequel, we use
subscript-s for source, subscript-r for relay, and subscript-d
for destination. Similarly, we will use subscript-sr for
source-to-relay, subscript-rd for relay-to-destination,
subscript-rr for relay-to-relay, and subscript-sd for
source-to-destination. At times, we will omit the subscripts
when referring to common quantities. For example, we will use
$s(t) \in \mathbb{C}^{N}$ to denote the time
$t \in \mathbb{Z}^+$ noisy signals radiated by the transmit
antenna arrays, and $u(t) \in \mathbb{C}^{M}$ to denote the
time-$t$ undistorted signals collected by the receive antenna
arrays. More specifically, the source's and relay's radiated
signals are $s_s(t) \in \mathbb{C}^{N_s}$ and
$s_r(t) \in \mathbb{C}^{N_r}$, respectively, while the relay's
and destination's collected signals are
$u_r(t) \in \mathbb{C}^{M_r}$ and $u_d(t) \in \mathbb{C}^{M_d}$,
respectively.

A. Propagation Channels 

We assume that propagation between each transmitter-receiver
pair can be characterized by a Rayleigh-fading MIMO channel
$H \in \mathbb{C}^{M \times N}$ corrupted by additive white
Gaussian noise (AWGN) $n(t)$. By "Rayleigh fading," we mean that
$\mathrm{vec}(H) \sim \mathcal{CN}(0, I_{MN})$, and by "AWGN,"
we mean that $n(t) \sim \mathcal{CN}(0, I_M)$. The time-$t$
radiated signals $s(t)$ are then related to the received signals
$u(t)$ via

$$u_r(t) = \sqrt{\rho_r}\, H_{sr} s_s(t) + \sqrt{\eta_r}\, H_{rr} s_r(t) + n_r(t) \tag{1}$$
$$u_d(t) = \sqrt{\rho_d}\, H_{rd} s_r(t) + \sqrt{\eta_d}\, H_{sd} s_s(t) + n_d(t). \tag{2}$$

In (1)-(2), $\rho_r > 0$ and $\rho_d > 0$ denote the
signal-to-noise ratio (SNR) at the relay and destination, while
$\eta_r > 0$ and $\eta_d > 0$ denote the interference-to-noise
ratio (INR) at the relay and destination. (As described in the
sequel, the destination treats the source-to-destination link as
interference.) The INR $\eta_r$ will depend on the separation
between, and orientation of, the relay's transmit and receive
antenna arrays [10], whereas the INR $\eta_d$ will depend on the
separation between source and destination modems, so that
typically $\eta_d \ll \eta_r$. We emphasize that (1)-(2) models
the channels $H_{sr}$, $H_{rr}$, $H_{rd}$, and $H_{sd}$ as
time-invariant quantities.
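
As a concrete illustration of the propagation model, the following minimal sketch draws Rayleigh-fading channels and applies the two link equations of this subsection to already-radiated signals. The helper names (`crandn`, `simulate_links`) and the numerical values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. circular complex Gaussian CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def simulate_links(rng, H_sr, H_rr, H_rd, H_sd, s_s, s_r, rho_r, eta_r, rho_d, eta_d):
    """Apply the relay and destination link equations to radiated signals.

    s_s is N_s x T and s_r is N_r x T (one column per channel use).
    Returns the undistorted received signals (u_r, u_d).
    """
    M_r, M_d, T = H_sr.shape[0], H_rd.shape[0], s_s.shape[1]
    u_r = np.sqrt(rho_r) * H_sr @ s_s + np.sqrt(eta_r) * H_rr @ s_r + crandn(rng, M_r, T)
    u_d = np.sqrt(rho_d) * H_rd @ s_r + np.sqrt(eta_d) * H_sd @ s_s + crandn(rng, M_d, T)
    return u_r, u_d

# Hypothetical sizes and SNR/INR values, chosen only for illustration.
rng = np.random.default_rng(0)
N_s, N_r, M_r, M_d, T = 3, 3, 4, 4, 1000
H_sr, H_rr = crandn(rng, M_r, N_s), crandn(rng, M_r, N_r)
H_rd, H_sd = crandn(rng, M_d, N_r), crandn(rng, M_d, N_s)
s_s, s_r = crandn(rng, N_s, T), crandn(rng, N_r, T)
u_r, u_d = simulate_links(rng, H_sr, H_rr, H_rd, H_sd, s_s, s_r,
                          rho_r=100.0, eta_r=1000.0, rho_d=50.0, eta_d=1.0)
```

The channels are drawn once and reused for all channel uses, which mirrors the time-invariance assumption stated above.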

B. Transmission Protocol 

For full-duplex decode-and-forward relaying, we partition
the time indices $t = 0, 1, 2, \dots$ into a sequence of
communication epochs $\{\mathcal{T}_i\}_{i=0}^{\infty}$ where,
during epoch $\mathcal{T}_i \subset \mathbb{Z}^+$, the source
communicates the $i$-th information packet to the relay, while
simultaneously the relay communicates the $(i-1)$-th
information packet to the destination. Before the first data
communication epoch, we assume the existence of a training
epoch $\mathcal{T}_{train}$ during which the modems estimate the
channel state. From the estimated channel state, the data
communication design parameters are optimized and the resulting
parameters are used for every data communication epoch.
Since the design and analysis will be identical for every
data-communication epoch (as a consequence of channel
time-invariance), we suppress the index $i$ in the sequel and
refer to an arbitrary data communication epoch as
$\mathcal{T}_{data}$.

The training epoch is partitioned into two equal-length
periods (i.e., $\mathcal{T}_{train}[1]$ and
$\mathcal{T}_{train}[2]$) to avoid self-interference when
estimating the channel matrices. Each data epoch is also
partitioned into two periods (i.e., $\mathcal{T}_{data}[1]$ and
$\mathcal{T}_{data}[2]$) of normalized duration
$\tau \in [0, 1]$ and $1 - \tau$, respectively, over which the
transmission parameters can be independently optimized. As we
shall see in the sequel, such flexibility is critical when the
INR $\eta_r$ is large relative to the SNR $\rho_r$. Moreover,
this latter partitioning allows us to formulate both half- and
full-duplex schemes as special cases of a more general
transmission protocol. For use in the sequel, we find it
convenient to define $\tau[1] \triangleq \tau$ and
$\tau[2] \triangleq 1 - \tau$. Within each of these periods, we
assume that the transmitted signals are zero-mean and wide-sense
stationary.

C. Limited Transmitter Dynamic Range 

We model the effect of limited transmitter dynamic range
(DR) by injecting, per transmit antenna, an independent
zero-mean Gaussian "transmitter noise" whose variance is
$\kappa$ times the energy of the intended transmit signal at
that antenna. In particular, say that $x(t) \in \mathbb{C}^{N}$
denotes the transmitter's intended time-$t$ transmit signal, and
say $Q \triangleq \mathrm{Cov}\{x(t)\}$ over the relevant time
period (e.g., $t \in \mathcal{T}_{data}[1]$). We then write the
time-$t$ noisy radiated signal as

$$s(t) = x(t) + c(t) \quad \text{s.t.} \quad \begin{cases} c(t) \sim \mathcal{CN}\big(0, \kappa\, \mathrm{diag}(Q)\big) \\ c(t) \perp\!\!\!\perp x(t) \end{cases} \tag{3}$$

where $c(t) \in \mathbb{C}^{N}$ denotes transmitter noise and
$\perp\!\!\!\perp$ statistical independence. Typically,
$\kappa \ll 1$. As shown by measurements of various hardware
setups (e.g., [16], [17]), the independent Gaussian noise model
in (3) closely approximates the combined effects of additive
power-amp noise, non-linearities in the DAC and power-amp, and
oscillator phase noise. Moreover, the dependence of the
transmitter-noise variance on intended signal power in (3)
follows directly from the definition of limited dynamic range.

D. Limited Receiver Dynamic Range 

We model the effect of limited receiver-DR by injecting, per
receive antenna, an independent zero-mean Gaussian "receiver
distortion" whose variance is $\beta$ times the energy collected
by that antenna. In particular, say that
$u(t) \in \mathbb{C}^{M}$ denotes the receiver's undistorted
time-$t$ received vector, and say
$\Phi \triangleq \mathrm{Cov}\{u(t)\}$ over the relevant time
period (e.g., $t \in \mathcal{T}_{data}[1]$). We then write the
distorted post-ADC received signal as

$$y(t) = u(t) + e(t) \quad \text{s.t.} \quad \begin{cases} e(t) \sim \mathcal{CN}\big(0, \beta\, \mathrm{diag}(\Phi)\big) \\ e(t) \perp\!\!\!\perp u(t) \end{cases} \tag{4}$$

where $e(t) \in \mathbb{C}^{M}$ is additive distortion.
Typically, $\beta \ll 1$. From a theoretical perspective,
automatic gain control (AGC) followed by dithered uniform
quantization [18] yields quantization errors whose statistics
closely match the model (4). More importantly, studies (e.g.,
[19]) have shown that the independent Gaussian distortion model
accurately captures the combined effects of additive AGC noise,
non-linearities in the ADC and gain-control, and oscillator
phase noise in practical hardware.

Figure 2 summarizes our model. The dashed lines indicate 
that the distortion levels are proportional to mean energy levels 
and not to the instantaneous value. 
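
The two limited-DR models can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the mean energy per antenna (the diagonal of the covariance) is estimated from the sample block rather than known analytically.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. circular complex Gaussian CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def add_tx_noise(rng, x, kappa):
    """Radiated signal s = x + c, with per-antenna noise variance kappa * mean energy of x.

    x has one row per transmit antenna and one column per channel use;
    zero-mean signals are assumed, so the mean energy equals diag(Q).
    """
    energy = np.mean(np.abs(x) ** 2, axis=1, keepdims=True)
    return x + np.sqrt(kappa * energy) * crandn(rng, *x.shape)

def add_rx_distortion(rng, u, beta):
    """Post-ADC signal y = u + e, with per-antenna distortion variance beta * mean energy of u."""
    energy = np.mean(np.abs(u) ** 2, axis=1, keepdims=True)
    return u + np.sqrt(beta * energy) * crandn(rng, *u.shape)
```

Note that the injected variance scales with the mean energy of each antenna, not with the instantaneous sample value, which is the distinction the dashed lines in the model figure are meant to convey.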




Fig. 2. Our model of full-duplex MIMO relaying under limited 
transmitter/receiver-DR. The dashed lines denote statistical dependence. 



III. Analysis of Achievable Rate 

A. Pilot-Aided Channel Estimation 

In this section, we describe the pilot-aided channel estimation
procedure that is used to learn the channel matrices $H$.
In our protocol, the training epoch consists of two periods,
$\mathcal{T}_{train}[1]$ and $\mathcal{T}_{train}[2]$, each
spanning $TN$ channel uses (for some $T \in \mathbb{Z}^+$). For
all times $t \in \mathcal{T}_{train}[1]$, we assume that the
source transmits a known pilot signal and the relay remains
silent, while, for all $t \in \mathcal{T}_{train}[2]$, the relay
transmits and the source remains silent. Moreover, we construct
the pilot sequence
$X = [x(1), \dots, x(TN)] \in \mathbb{C}^{N \times TN}$ to
satisfy $\frac{1}{2T} X X^H = I_N$, where the scaling has been
chosen to satisfy a per-period power constraint of the form
$\mathrm{tr}(Q) = 2$, consistent with the data power constraints
that will be described in the sequel.

Our limited transmitter/receiver-DR model implies that the
(distorted) space-time pilot signal observed by a given receiver
takes the form

$$Y = \sqrt{\alpha}\, H (X + C) + N + E, \tag{5}$$

where $\alpha \in \{\rho_r, \eta_r, \rho_d, \eta_d\}$ for
$H \in \{H_{sr}, H_{rr}, H_{rd}, H_{sd}\}$, respectively. In (5),
$C$ is an $N \times TN$ matrix of transmitter noise, while $E$
and $N$ are $M \times TN$ matrices of receiver distortion and
AWGN, respectively. At the conclusion of training, we assume
that each receiver uses least-squares (LS) to estimate the
corresponding channel $H$ as

$$\sqrt{\alpha}\, \hat{H} \triangleq \frac{1}{2T}\, Y X^H \tag{6}$$
and communicates this estimate to the transmitter. In the
sequel, it will be useful to decompose the channel estimate
into the true channel plus an estimation error. In Appendix A,
it is shown that such a decomposition takes the form

$$\sqrt{\alpha}\, \hat{H} = \sqrt{\alpha}\, H + D^{1/2} \tilde{H}, \tag{7}$$

where the entries of $\tilde{H}$ are i.i.d. $\mathcal{CN}(0, 1)$,
and where

$$D = \frac{1}{2T} \Big( (1 + \beta) I + \alpha \frac{2\kappa}{N} H H^H + \alpha \frac{2\beta}{N} (1 + \kappa)\, \mathrm{diag}(H H^H) \Big) \tag{8}$$

characterizes the spatial covariance of the estimation error.
Using $\beta \ll 1$ and $\kappa \ll 1$, this covariance reduces to

$$D \approx \frac{1}{2T} \Big( I + \alpha \frac{2\kappa}{N} H H^H + \alpha \frac{2\beta}{N}\, \mathrm{diag}(H H^H) \Big). \tag{9}$$
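
The training procedure can be sketched as below. The sketch assumes a pilot matrix $X$ with $\frac{1}{2T} X X^H = I_N$, an LS estimate of the form $\frac{1}{2T} Y X^H$, and an error covariance $D = \frac{1}{2T}\big((1+\beta)I + \alpha\frac{2\kappa}{N}HH^H + \alpha\frac{2\beta}{N}(1+\kappa)\,\mathrm{diag}(HH^H)\big)$. The DFT-based pilot construction and all function names are assumptions made for illustration; the paper does not prescribe a particular pilot design here.

```python
import numpy as np

def crandn(rng, *shape):
    """Draw i.i.d. circular complex Gaussian CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def make_pilots(N, T):
    """Pilot matrix X (N x TN) with X X^H / (2T) = I_N, built from DFT rows (an assumed design)."""
    L = T * N
    F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / L)  # orthogonal rows, energy L each
    return np.sqrt(2.0 / N) * F

def simulate_training(rng, H, X, alpha, kappa, beta):
    """Distorted pilot observation Y = sqrt(alpha) H (X + C) + N + E."""
    M, N = H.shape
    C = np.sqrt(kappa * 2.0 / N) * crandn(rng, *X.shape)            # tx noise, diag(Q) = (2/N) I
    U = np.sqrt(alpha) * H @ (X + C) + crandn(rng, M, X.shape[1])   # undistorted signal plus AWGN
    # Mean received energy per antenna: diag(Phi) = 1 + alpha (2/N)(1 + kappa) diag(H H^H)
    phi = 1.0 + alpha * (2.0 / N) * (1.0 + kappa) * np.sum(np.abs(H) ** 2, axis=1, keepdims=True)
    return U + np.sqrt(beta * phi) * crandn(rng, *U.shape)          # add receiver distortion

def ls_estimate(Y, X, T):
    """Least-squares estimate of sqrt(alpha) * H."""
    return Y @ X.conj().T / (2 * T)

def error_cov(H, alpha, kappa, beta, T):
    """Spatial covariance D of the estimation error (exact form, before the small-kappa/beta approximation)."""
    M, N = H.shape
    HH = H @ H.conj().T
    return ((1 + beta) * np.eye(M) + alpha * 2 * kappa / N * HH
            + alpha * 2 * beta / N * (1 + kappa) * np.diag(np.diag(HH))) / (2 * T)
```

Averaging the squared estimation error over many training runs should approach $N\,\mathrm{tr}(D)$, and the $1/(2T)$ factor shows directly why longer training shrinks the error.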



B. Interference Cancellation and Equivalent Channel 

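We now describe how the relay partially cancels its
self-interference, and construct a simplified model for the
result.

Recall that the data communication period is partitioned into
two periods, $\mathcal{T}_{data}[1]$ and $\mathcal{T}_{data}[2]$,
and that, within each, the transmitted signals are wide-sense
stationary. Thus, at any time $t \in \mathcal{T}_{data}[l]$, the
relay's (instantaneous, distorted) observed signal takes the form

$$\begin{aligned} y_r(t) ={}& \big(\sqrt{\rho_r}\, \hat{H}_{sr} - D_{sr}^{1/2} \tilde{H}_{sr}\big)\big(x_s(t) + c_s(t)\big) + n_r(t) + e_r(t) \\ &+ \big(\sqrt{\eta_r}\, \hat{H}_{rr} - D_{rr}^{1/2} \tilde{H}_{rr}\big)\big(x_r(t) + c_r(t)\big), \end{aligned} \tag{10}$$

as implied by Fig. 2 and (7). Defining the aggregate noise term

$$\begin{aligned} v_r(t) \triangleq{}& \sqrt{\rho_r}\, \hat{H}_{sr} c_s(t) - D_{sr}^{1/2} \tilde{H}_{sr}\big(x_s(t) + c_s(t)\big) + n_r(t) \\ &+ e_r(t) + \sqrt{\eta_r}\, \hat{H}_{rr} c_r(t) - D_{rr}^{1/2} \tilde{H}_{rr}\big(x_r(t) + c_r(t)\big), \end{aligned} \tag{11}$$

we can write the observed signal as
$y_r(t) = \sqrt{\rho_r}\, \hat{H}_{sr} x_s(t) + \sqrt{\eta_r}\, \hat{H}_{rr} x_r(t) + v_r(t)$,
where the self-interference term
$\sqrt{\eta_r}\, \hat{H}_{rr} x_r(t)$ is known and thus can be
canceled. The interference-canceled signal
$z_r(t) \triangleq y_r(t) - \sqrt{\eta_r}\, \hat{H}_{rr} x_r(t)$
can then be written as

$$z_r(t) = \sqrt{\rho_r}\, \hat{H}_{sr} x_s(t) + v_r(t). \tag{12}$$

Equation (12) shows that, in effect, the information signal
$x_s(t)$ propagates through a known channel
$\sqrt{\rho_r}\, \hat{H}_{sr}$ corrupted by an aggregate
(possibly non-Gaussian) noise $v_r(t)$, whose
$(\hat{H}_{sr}, \hat{H}_{rr})$-conditional covariance we denote
as
$\hat{\Sigma}_r[l] \triangleq \mathrm{Cov}\{v_r(t) \mid \hat{H}_{sr}, \hat{H}_{rr}\}_{t \in \mathcal{T}_{data}[l]}$,
recalling that $l \in \{1, 2\}$ indexes the data-period. In
Appendix B, we show that

$$\begin{aligned} \hat{\Sigma}_r[l] \approx{}& I + \kappa \rho_r \hat{H}_{sr}\, \mathrm{diag}(Q_s[l])\, \hat{H}_{sr}^H + \hat{D}_{sr}\, \mathrm{tr}(Q_s[l]) \\ &+ \kappa \eta_r \hat{H}_{rr}\, \mathrm{diag}(Q_r[l])\, \hat{H}_{rr}^H + \hat{D}_{rr}\, \mathrm{tr}(Q_r[l]) \\ &+ \beta \rho_r\, \mathrm{diag}\big(\hat{H}_{sr} Q_s[l] \hat{H}_{sr}^H\big) + \beta \eta_r\, \mathrm{diag}\big(\hat{H}_{rr} Q_r[l] \hat{H}_{rr}^H\big), \end{aligned} \tag{13}$$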



In our transmission protocol, a single training epoch is followed by a 
large number of data epochs, and so the relative training overhead becomes 
negligible as the number of data epochs grows large. 



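where $\hat{D}_{sr} \triangleq \mathrm{E}\{D_{sr} \mid \hat{H}_{sr}\}$
and $\hat{D}_{rr} \triangleq \mathrm{E}\{D_{rr} \mid \hat{H}_{rr}\}$
obey

$$\hat{D} \approx \frac{1}{2T} \Big( I + \alpha \frac{2\kappa}{N} \hat{H} \hat{H}^H + \alpha \frac{2\beta}{N}\, \mathrm{diag}\big(\hat{H} \hat{H}^H\big) \Big), \tag{14}$$

and where the approximations in (13)-(14) follow from
$\kappa \ll 1$ and $\beta \ll 1$. We note, for later use, that
the channel estimation error terms $\hat{D}$ can be made
arbitrarily small through appropriate choice of $T$.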

The effective channel from the relay to the destination can
be similarly stated as

$$y_d(t) = \sqrt{\rho_d}\, \hat{H}_{rd} x_r(t) + v_d(t) \tag{15}$$
$$\begin{aligned} v_d(t) \triangleq{}& \sqrt{\rho_d}\, \hat{H}_{rd} c_r(t) - D_{rd}^{1/2} \tilde{H}_{rd}\big(x_r(t) + c_r(t)\big) + n_d(t) + e_d(t) \\ &+ \sqrt{\eta_d}\, \hat{H}_{sd}\big(x_s(t) + c_s(t)\big) - D_{sd}^{1/2} \tilde{H}_{sd}\big(x_s(t) + c_s(t)\big), \end{aligned} \tag{16}$$

and an expression similar to (13) can be derived for the
destination's aggregate noise covariance,
$\hat{\Sigma}_d[l] \triangleq \mathrm{Cov}\{v_d(t) \mid \hat{H}_{rd}, \hat{H}_{sd}\}_{t \in \mathcal{T}_{data}[l]}$,
during data-period $l \in \{1, 2\}$. Unlike the relay node,
however, the destination node does not cancel the interference
term $\sqrt{\eta_d}\, \hat{H}_{sd} x_s(t)$, but rather lumps it
in with the aggregate noise $v_d(t)$. The latter practice is
well motivated under the assumption that $\eta_d \ll \rho_d$,
i.e., that the source-to-destination link is much weaker than
the relay-to-destination link. Figure 3 summarizes the
equivalent system model.

Fig. 3. Equivalent model of full-duplex MIMO relaying. 



C. Bounds on Achievable Rate 

The end-to-end mutual information can be written, for a
given time-sharing parameter $\tau$, as

$$I_\tau(Q) = \min\Big\{ \sum_{l=1}^{2} \tau[l]\, I_{sr}(Q[l]),\ \sum_{l=1}^{2} \tau[l]\, I_{rd}(Q[l]) \Big\}, \tag{17}$$

where $I_{sr}(Q[l])$ and $I_{rd}(Q[l])$ are the period-$l$
mutual informations of the source-to-relay channel and
relay-to-destination channel, respectively, and where
$Q[l] \triangleq (Q_s[l], Q_r[l])$ and
$Q \triangleq (Q[1], Q[2])$.

To analyze $I_{sr}(Q[l])$ and $I_{rd}(Q[l])$, we leverage the
equivalent system model shown in Fig. 3, which includes
channel-estimation error and relay-self-interference
cancellation, and treats the source-to-destination link as a
source of noise. The mutual-information analysis is, however,
still complicated by the fact that the aggregate noises $v_r(t)$
and $v_d(t)$ are generally non-Gaussian, as a result of the
channel-estimation-error components in (11) and (16). However,
it is known that, among all noise distributions of a given
covariance, the Gaussian one is worst from a mutual-information
perspective [20]. In particular, treating the noise as Gaussian
yields the lower bounds
$I_{sr}(Q[l]) \ge \underline{I}_{sr}(Q[l])$ and
$I_{rd}(Q[l]) \ge \underline{I}_{rd}(Q[l])$, where [21]

$$\underline{I}_{sr}(Q[l]) = \log\det\big( I + \rho_r \hat{H}_{sr} Q_s[l] \hat{H}_{sr}^H \hat{\Sigma}_r^{-1}[l] \big) \tag{18}$$
$$\phantom{\underline{I}_{sr}(Q[l])} = \log\det\big( \rho_r \hat{H}_{sr} Q_s[l] \hat{H}_{sr}^H + \hat{\Sigma}_r[l] \big) - \log\det\big( \hat{\Sigma}_r[l] \big) \tag{19}$$

and

$$\underline{I}_{rd}(Q[l]) = \log\det\big( I + \rho_d \hat{H}_{rd} Q_r[l] \hat{H}_{rd}^H \hat{\Sigma}_d^{-1}[l] \big) \tag{20}$$
$$\phantom{\underline{I}_{rd}(Q[l])} = \log\det\big( \rho_d \hat{H}_{rd} Q_r[l] \hat{H}_{rd}^H + \hat{\Sigma}_d[l] \big) - \log\det\big( \hat{\Sigma}_d[l] \big), \tag{21}$$



and thus a lower bound on the end-to-end $\tau$-specific
achievable-rate is

$$\underline{I}_\tau(Q) = \min\Big\{ \sum_{l=1}^{2} \tau[l]\, \underline{I}_{sr}(Q[l]),\ \sum_{l=1}^{2} \tau[l]\, \underline{I}_{rd}(Q[l]) \Big\}. \tag{22}$$

Moreover, the rate $\underline{I}_\tau(Q)$ bits-per-channel-use
(bpcu) can be achieved via independent Gaussian codebooks at the
transmitters and maximum-likelihood detection at the receivers
[21].

A straightforward achievable-rate upper bound
$\bar{I}_\tau(Q)$ results from the case of perfect CSI (i.e.,
$D = 0$), where $v_r(t)$ and $v_d(t)$ are Gaussian. Moreover,
the lower bound $\underline{I}_\tau(Q)$ converges to the upper
bound $\bar{I}_\tau(Q)$ as the training $T \to \infty$.
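
The per-link bounds have the generic form $\log\det(I + \rho\, \hat{H} Q \hat{H}^H \hat{\Sigma}^{-1})$, which can equivalently be evaluated as a difference of two log-determinants. The following minimal sketch computes both forms and the time-shared minimum; the function names are illustrative, and the noise covariance is simply passed in as an argument rather than computed from the model.

```python
import numpy as np

def rate_lower_bound(H_hat, Q, Sigma, snr):
    """log2 det(I + snr * H Q H^H Sigma^{-1}), in bits per channel use."""
    M = H_hat.shape[0]
    S = snr * H_hat @ Q @ H_hat.conj().T
    _, logdet = np.linalg.slogdet(np.eye(M) + S @ np.linalg.inv(Sigma))
    return logdet / np.log(2)

def rate_lower_bound_diff(H_hat, Q, Sigma, snr):
    """Same quantity written as log2 det(snr H Q H^H + Sigma) - log2 det(Sigma)."""
    S = snr * H_hat @ Q @ H_hat.conj().T
    return (np.linalg.slogdet(S + Sigma)[1] - np.linalg.slogdet(Sigma)[1]) / np.log(2)

def end_to_end_rate(tau, I_sr, I_rd):
    """Time-shared end-to-end rate: min of the two tau-weighted sums over the two data periods."""
    weights = np.array([tau, 1.0 - tau])
    return min(weights @ np.asarray(I_sr), weights @ np.asarray(I_rd))
```

The difference-of-log-determinants form avoids an explicit matrix inverse, which tends to be the numerically safer choice when the noise covariance is poorly conditioned.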

IV. Transmit Covariance Optimization 

We would now like to find the transmit covariance matrices
$Q$ that maximize the achievable-rate lower bound
$\underline{I}_\tau(Q)$ in (22) subject to the per-link power
constraint $Q \in \mathcal{Q}_\tau$, where

$$\mathcal{Q}_\tau \triangleq \Big\{ Q \ \text{s.t.}\ \sum_{l=1}^{2} \tau[l]\, \mathrm{tr}(Q_s[l]) \le 1,\ \sum_{l=1}^{2} \tau[l]\, \mathrm{tr}(Q_r[l]) \le 1,\ Q_s[l] = Q_s^H[l] \ge 0,\ Q_r[l] = Q_r^H[l] \ge 0 \Big\}, \tag{23}$$

and subsequently optimize the time-sharing parameter $\tau$. We
note that optimizing the transmit covariance matrices is
equivalent to jointly optimizing the transmission beam-patterns
and power levels. In the sequel, we denote the optimal (i.e.,
maximin) rate, for a given $\tau$, by

$$\underline{I}_{*,\tau} \triangleq \max_{Q \in \mathcal{Q}_\tau} \underline{I}_\tau(Q), \tag{24}$$

and we use $\mathcal{Q}_{*,\tau}$ to denote the corresponding
set of maximin covariance designs $Q$ (which are, in general,
not unique). Then, with
$\tau_* \triangleq \arg\max_{\tau \in [0,1]} \underline{I}_{*,\tau}$,
the optimal rate is
$\underline{I}_* \triangleq \underline{I}_{*,\tau_*}$ and the
corresponding set of maximin designs is
$\mathcal{Q}_* \triangleq \mathcal{Q}_{*,\tau_*}$.



Throughout the paper, we take "log" to be base-2. 



A. Weighted-Sum-Rate Optimization 

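It is important to realize that, within the maximin design
set $\mathcal{Q}_{*,\tau}$, there exists at least one
"link-equalizing" design, i.e.,
$\exists\, Q \in \mathcal{Q}_{*,\tau}$ s.t.
$\underline{I}_{sr,\tau}(Q) = \underline{I}_{rd,\tau}(Q)$, where
$\underline{I}_{sr,\tau}(Q)$ and $\underline{I}_{rd,\tau}(Q)$
denote the first and second arguments of the minimum in (22).
To see why this is the case, notice that, given any maximin
design $Q$ such that
$\underline{I}_{sr,\tau}(Q) > \underline{I}_{rd,\tau}(Q)$, a
simple scaling of $Q_s[l]$ can yield
$\underline{I}_{sr,\tau}(Q) = \underline{I}_{rd,\tau}(Q)$, and
thus an equalizing design. A similar argument can be made when
$\underline{I}_{rd,\tau}(Q) > \underline{I}_{sr,\tau}(Q)$.

Referring to the set of all link-equalizing designs (maximin
or otherwise), for a given $\tau$, as

$$\mathcal{Q}_{=,\tau} \triangleq \big\{ Q \in \mathcal{Q}_\tau \ \text{s.t.}\ \underline{I}_{sr,\tau}(Q) = \underline{I}_{rd,\tau}(Q) \big\}, \tag{25}$$

the maximin equalizing design can be found by solving either
$\arg\max_{Q \in \mathcal{Q}_{=,\tau}} \underline{I}_{sr,\tau}(Q)$
or
$\arg\max_{Q \in \mathcal{Q}_{=,\tau}} \underline{I}_{rd,\tau}(Q)$,
where the equivalence is due to the equalizing property. More
generally, the maximin equalizing design can be found by solving

$$\arg\max_{Q \in \mathcal{Q}_{=,\tau}} \underline{I}_\tau(Q, \zeta) \tag{26}$$

with any fixed $\zeta \in [0, 1]$ and the $\zeta$-weighted
sum-rate

$$\underline{I}_\tau(Q, \zeta) \triangleq \zeta\, \underline{I}_{sr,\tau}(Q) + (1 - \zeta)\, \underline{I}_{rd,\tau}(Q). \tag{27}$$

To find the maximin equalizing design, we propose relaxing
the constraint on $Q$ from $\mathcal{Q}_{=,\tau}$ to
$\mathcal{Q}_\tau$, yielding the $\zeta$-weighted-sum-rate
optimization problem

$$Q_{*,\tau}(\zeta) \triangleq \arg\max_{Q \in \mathcal{Q}_\tau} \underline{I}_\tau(Q, \zeta). \tag{28}$$

Now, if there exists $\zeta_= \in [0, 1]$ such that the solution
$Q_{*,\tau}(\zeta_=)$ to (28) is link-equalizing, then, because
$\mathcal{Q}_{=,\tau} \subset \mathcal{Q}_\tau$, we know that
$Q_{*,\tau}(\zeta_=)$ must also solve the problem (26),
implying that $Q_{*,\tau}(\zeta_=)$ is maximin. Figure 4(a)
illustrates the case where such a $\zeta_=$ exists. It may be,
however, that no $\zeta \in [0, 1]$ yields a link-equalizing
solution $Q_{*,\tau}(\zeta)$, as illustrated in Fig. 4(b). This
case occurs when
$\underline{I}_{sr,\tau}(Q_{*,\tau}(\zeta)) > \underline{I}_{rd,\tau}(Q_{*,\tau}(\zeta))$
for all $\zeta \in [0, 1]$, such as when $\rho_r \gg \rho_d$. In
this latter case, the maximin rate reduces to
$\underline{I}_{*,\tau} = \lim_{\zeta \to 0} \underline{I}_{rd,\tau}(Q_{*,\tau}(\zeta))$.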



Fig. 4. Illustrative examples of $\tau$-specific
$\zeta$-weighted sum-rate optimization in the case (a) when a
link-equalizing solution exists and (b) when one does not exist.
Here, $\underline{I}_{sr,\tau}(Q)$ and
$\underline{I}_{rd,\tau}(Q)$ are the source-to-relay and
relay-to-destination rates, respectively,
$\underline{I}_\tau(Q, \zeta) = \zeta\, \underline{I}_{sr,\tau}(Q) + (1 - \zeta)\, \underline{I}_{rd,\tau}(Q)$
is the $\zeta$-weighted sum-rate, and $Q_{*,\tau}(\zeta)$ is the
set of optimal covariance matrices for a given time-share
$\tau$ and weight $\zeta$.

Whether or not $\zeta_= \in [0, 1]$ actually exists, we propose
to search for it using bisection, leveraging the fact that
$\underline{I}_{rd,\tau}(Q_{*,\tau}(\zeta))$ is non-increasing
in $\zeta$ and $\underline{I}_{sr,\tau}(Q_{*,\tau}(\zeta))$ is
non-decreasing in $\zeta$. To perform the bisection search, we
initialize the search interval $\mathcal{I}$ at $[0, 1]$, and
bisect it at each step after testing the condition
$\underline{I}_{rd,\tau}(Q_{*,\tau}(\zeta)) > \underline{I}_{sr,\tau}(Q_{*,\tau}(\zeta))$
at the midpoint location $\zeta$ in $\mathcal{I}$: if the
condition holds true, we discard the left sub-interval of
$\mathcal{I}$, else we discard the right sub-interval. We stop
bisecting when
$|\underline{I}_{rd,\tau}(Q_{*,\tau}(\zeta)) - \underline{I}_{sr,\tau}(Q_{*,\tau}(\zeta))|$
falls below a threshold or a maximum number of iterations has
elapsed. Notice that, even when there exists no
$\zeta_= \in [0, 1]$, bisection converges towards the desired
weight $\zeta = 0$. Subsequently, we optimize over
$\tau \in [0, 1]$ using a grid-search.
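
The bisection over the weight can be sketched as follows. The two callables stand in for the source-to-relay and relay-to-destination rates evaluated at the optimized covariances for a given weight; in the actual scheme each evaluation would require solving the weighted-sum-rate problem, which is abstracted away here. The function name, tolerance, and iteration cap are illustrative assumptions.

```python
def bisect_weight(rate_sr, rate_rd, tol=1e-6, max_iter=50):
    """Search [0, 1] for a weight zeta that equalizes the two link rates.

    rate_sr(zeta) is assumed non-decreasing and rate_rd(zeta) non-increasing in zeta.
    Returns the last midpoint tested.
    """
    lo, hi = 0.0, 1.0
    zeta = 0.5
    for _ in range(max_iter):
        zeta = 0.5 * (lo + hi)
        r_sr, r_rd = rate_sr(zeta), rate_rd(zeta)
        if abs(r_rd - r_sr) < tol:
            break                # links are (nearly) equalized
        if r_rd > r_sr:
            lo = zeta            # discard the left sub-interval
        else:
            hi = zeta            # discard the right sub-interval
    return zeta
```

When the source-to-relay rate exceeds the relay-to-destination rate for every weight, the right sub-interval is discarded at every step, so the returned weight shrinks towards zero, matching the limiting behavior described above.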

B. Gradient Projection 

At each bisection step, we use Gradient Projection (GP)
to solve the $\tau$-specific, $\zeta$-weighted-sum-rate
optimization problem (28). The GP algorithm [22] is defined as
follows. For the generic problem of maximizing a function
$f(x)$ over $x \in \mathcal{X}$, the GP algorithm starts with an
initialization $x^{(0)}$ and iterates the following steps for
$k = 0, 1, 2, 3, \dots$

$$\bar{x}^{(k)} = \mathcal{P}_{\mathcal{X}}\big( x^{(k)} + s^{(k)} \nabla f(x^{(k)}) \big) \tag{29}$$
$$x^{(k+1)} = x^{(k)} + \gamma^{(k)} \big( \bar{x}^{(k)} - x^{(k)} \big), \tag{30}$$

where $\mathcal{P}_{\mathcal{X}}(\cdot)$ denotes projection onto
the set $\mathcal{X}$ and $\nabla f(\cdot)$ denotes the gradient
of $f(\cdot)$. The parameters $\gamma^{(k)} \in (0, 1]$ and
$s^{(k)}$ act as stepsizes. In the sequel, we assume
$s^{(k)} = 1\ \forall k$.
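
To make the generic iteration concrete, here is a minimal sketch of gradient projection on a toy problem: a projected gradient step followed by a move of fraction $\gamma$ towards the projected point. The fixed value of $\gamma$, the stopping rule, and the box-constrained quadratic used in the usage example are simplifying assumptions; they are not the stepsize rule or the objective used in the paper.

```python
import numpy as np

def gradient_projection(grad, project, x0, gamma=0.25, s=1.0, eps=1e-9, max_iter=1000):
    """Generic gradient projection for maximizing f over a convex set.

    grad(x) returns the gradient of f at x; project(x) projects onto the feasible set.
    A fixed gamma is used here purely for illustration.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_bar = project(x + s * grad(x))     # projected gradient step
        x_new = x + gamma * (x_bar - x)      # move part of the way towards x_bar
        if np.max(np.abs(x_new - x)) < eps:  # stop when the iterate barely changes
            return x_new
        x = x_new
    return x

# Toy usage: maximize -||x - c||^2 over the box [0, 1]^3.
c = np.array([0.3, 1.7, -0.4])
x_opt = gradient_projection(grad=lambda x: -2.0 * (x - c),
                            project=lambda x: np.clip(x, 0.0, 1.0),
                            x0=np.full(3, 0.5))
```

For this toy problem the maximizer is the projection of `c` onto the box, so the iterates settle at the clipped version of `c`.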



In applying GP to the optimization problem (28), we first
take gradient steps for $Q_r[1]$ and $Q_r[2]$, and then project
onto the constraint set (23). Next, we take gradient steps for
$Q_s[1]$ and $Q_s[2]$, and then project onto the constraint set.
In summary, denoting the relay gradient by
$G_r[l] \triangleq \nabla_{Q_r[l]} \underline{I}_\tau(Q, \zeta)$,
our GP algorithm iterates the following steps to convergence:

$$P_r^{(k)}[1] = Q_r^{(k)}[1] + G_r^{(k)}[1] \tag{31}$$
$$P_r^{(k)}[2] = Q_r^{(k)}[2] + G_r^{(k)}[2] \tag{32}$$
$$\big( \bar{Q}_r^{(k)}[1], \bar{Q}_r^{(k)}[2] \big) = \mathcal{P}_{\mathcal{Q}}\big( P_r^{(k)}[1], P_r^{(k)}[2] \big) \tag{33}$$
$$Q_r^{(k+1)}[1] = Q_r^{(k)}[1] + \gamma^{(k)} \big( \bar{Q}_r^{(k)}[1] - Q_r^{(k)}[1] \big) \tag{34}$$
$$Q_r^{(k+1)}[2] = Q_r^{(k)}[2] + \gamma^{(k)} \big( \bar{Q}_r^{(k)}[2] - Q_r^{(k)}[2] \big) \tag{35}$$

and then repeats similar steps for $Q_s[1]$ and $Q_s[2]$. An
outer loop then repeats this pair of inner loops until the
maximum change in $Q$ is below a small positive threshold
$\epsilon$.

We now provide additional details on the GP steps. As for
the gradient, Appendix C shows that the gradient $G_r[l]$ can be
written as in (36), at the top of the next page, where



(37) 
(38) 



For $G_s[l]$, a similar expression can be derived. 

To compute the projection
$\mathcal{P}_{\mathcal{Q}}(P_r[1], P_r[2])$, we first notice
that, due to the Hermitian property of $P_r[l]$, we can
construct an eigenvalue decomposition
$P_r[l] = U_r[l] \Lambda_r[l] U_r^H[l]$ with unitary $U_r[l]$
and real-valued
$\Lambda_r[l] = \mathrm{Diag}(\lambda_{r,1}[l], \lambda_{r,2}[l], \dots, \lambda_{r,N_r}[l])$.
The projection of $(P_r[1], P_r[2])$ onto the constraint set (23)

* Because (24) is generally non-convex, finding the global maximum can 
be difficult. Although GP is guaranteed only to find a local, and not global, 
maximum, our experience with different initializations suggests that GP is 
indeed finding the global maximum in our problem. 



then equals
$\bar{Q}_r[l] = U_r[l] (\Lambda_r[l] - \mu I)^+ U_r^H[l]$, where
$(B)^+ = \max(B, 0)$ elementwise, and where $\mu$ is chosen
such that
$\sum_{l=1}^{2} \tau[l] \sum_{n=1}^{N_r} \max(\lambda_{r,n}[l] - \mu, 0) = 1$.
In essence, $\mathcal{P}_{\mathcal{Q}}(\cdot)$ performs
water-filling.
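
The water-filling projection can be sketched as below: eigendecompose each Hermitian matrix, subtract a common level $\mu$ from the eigenvalues, clip at zero, and pick $\mu$ so that the time-weighted trace meets the power budget. Finding $\mu$ by bisection, and setting $\mu = 0$ when the clipped inputs already satisfy the budget, are assumptions of this sketch; the function name is illustrative.

```python
import numpy as np

def project_covariances(P_list, tau_list, budget=1.0, iters=100):
    """Water-filling style projection of Hermitian matrices onto the weighted trace constraint.

    P_list holds one Hermitian matrix per data period and tau_list the period durations.
    Returns positive semidefinite matrices Q[l] with sum_l tau[l] * tr(Q[l]) <= budget.
    """
    eig = [np.linalg.eigh((P + P.conj().T) / 2) for P in P_list]  # (eigenvalues, eigenvectors)

    def used(mu):
        # Weighted power consumed when the water level is lowered by mu.
        return sum(t * np.maximum(lam - mu, 0.0).sum() for t, (lam, _) in zip(tau_list, eig))

    if used(0.0) <= budget:
        mu = 0.0                                  # clipping alone already satisfies the budget
    else:
        lo, hi = 0.0, max(lam.max() for lam, _ in eig)
        for _ in range(iters):                    # bisection on the level mu
            mu = 0.5 * (lo + hi)
            if used(mu) > budget:
                lo = mu
            else:
                hi = mu
        mu = hi                                   # feasible side of the bracket
    return [(U * np.maximum(lam - mu, 0.0)) @ U.conj().T for lam, U in eig]
```

Because the consumed power is a continuous, non-increasing function of $\mu$, a simple bisection is enough to locate the level; a sorting-based closed form would also work.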

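To adjust the stepsize $\gamma^{(k)}$, we use the Armijo
stepsize rule [22], i.e., $\gamma^{(k)} = \nu^{m_k}$, where
$m_k$ is the smallest nonnegative integer that satisfies

$$\underline{I}_\tau\big(Q^{(k+1)}, \zeta\big) - \underline{I}_\tau\big(Q^{(k)}, \zeta\big) \ge \sigma \nu^{m_k} \sum_{l=1}^{2} \mathrm{tr}\Big( G_r^{(k)}[l]^H \big( \bar{Q}_r^{(k)}[l] - Q_r^{(k)}[l] \big) \Big) \tag{39}$$

for some constants $\sigma$, $\nu$ typically chosen so that
$\sigma \in [10^{-5}, 10^{-1}]$ and $\nu \in [0.1, 0.5]$. Above,
we used the shorthand
$Q^{(k)} \triangleq (Q_s[1], Q_s[2], Q_r^{(k)}[1], Q_r^{(k)}[2])$.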


V. Achievable-Rate Approximation 

The complicated nature of the optimization problem (24)
motivates us to approximate its solution, i.e., the
covariance-optimized achievable rate
$\underline{I}_{*,\tau} = \max_{Q \in \mathcal{Q}_\tau} \underline{I}_\tau(Q)$.
In doing so, we focus on the case of $T \to \infty$, where
channel estimation error is driven to zero so that
$\underline{I}_\tau(Q) = I_\tau(Q) = \bar{I}_\tau(Q)$. In
addition, for tractability, we restrict ourselves to the case
$N_s = N_r = N$ and $M_r = M_d = M$ (i.e., $N$ transmit antennas
and $M$ receive antennas at each node), the case $\eta_d = 0$
(i.e., no direct source-to-destination link), and the case
$\tau = \frac{1}{2}$ (i.e., equal time-sharing).

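Our approximation is built around the simplifying case
that the channel matrices $\{H_{sr}, H_{rr}, H_{rd}\}$ are each
diagonal, although not necessarily square, and have
$R \triangleq \min\{M, N\}$ identical diagonal entries equal to
$\sqrt{MN/R}$. (The latter value is chosen so that
$\mathrm{E}\{\mathrm{tr}(H H^H)\} = MN$, as assumed in
Section II-A.) In this case, the mutual information (22)
becomes (40), at the top of the next page. When
$\eta_r \ll \rho_r$, the $\eta_r$-dependent terms in (40) can be
ignored, after which it is straightforward to show that, under
the constraint (23), the optimal covariances are the "full
duplex"
$Q_{FD} \triangleq (\frac{1}{N} I, \frac{1}{N} I, \frac{1}{N} I, \frac{1}{N} I)$,
for which (40) gives

$$I(Q_{FD}) \approx R \log\Big( 1 + \min\Big\{ \frac{\rho_r}{\frac{R}{M} + (\kappa + \beta)(\rho_r + \eta_r)},\ \frac{\rho_d}{\frac{R}{M} + (\kappa + \beta)\rho_d} \Big\} \Big) \tag{41}$$
$$\phantom{I(Q_{FD})} = \begin{cases} R \log\Big( 1 + \dfrac{\rho_d}{\frac{R}{M} + (\kappa + \beta)\rho_d} \Big) & \text{if } \dfrac{\rho_r}{\rho_d} \ge 1 + \dfrac{(\kappa + \beta)\eta_r M}{R} \\[2ex] R \log\Big( 1 + \dfrac{\rho_r}{\frac{R}{M} + (\kappa + \beta)(\rho_r + \eta_r)} \Big) & \text{else.} \end{cases} \tag{42}$$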

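When $\eta_r \gg \rho_r$, the $\eta_r$-dependent term in (40)
dominates unless $Q_r[1] = 0$. In this case, the optimal
covariances are the "half duplex" ones
$Q_{HD} \triangleq (\frac{2}{N} I, 0, 0, \frac{2}{N} I)$, for
which (40) gives

$$I(Q_{HD}) \approx \begin{cases} \dfrac{R}{2} \log\Big( 1 + \dfrac{\rho_d}{\frac{R}{2M} + (\kappa + \beta)\rho_d} \Big) & \text{if } \dfrac{\rho_r}{\rho_d} \ge 1 \\[2ex] \dfrac{R}{2} \log\Big( 1 + \dfrac{\rho_r}{\frac{R}{2M} + (\kappa + \beta)\rho_r} \Big) & \text{else.} \end{cases} \tag{43}$$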

Finally, given any triple $(\rho_r, \eta_r, \rho_d)$, we
approximate the achievable rate as follows:
$\underline{I}_* \approx \max\{ I(Q_{FD}), I(Q_{HD}) \}$.
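
The approximation can be evaluated with a few lines of code. The sketch below assumes the closed forms implied by the diagonal-channel model with $R = \min\{M, N\}$: a full-duplex rate $R \log_2(1 + \min\{\rho_r / (R/M + (\kappa+\beta)(\rho_r+\eta_r)),\ \rho_d / (R/M + (\kappa+\beta)\rho_d)\})$ and a half-duplex rate $\frac{R}{2} \log_2(1 + \rho / (R/(2M) + (\kappa+\beta)\rho))$ with $\rho = \min\{\rho_r, \rho_d\}$. These expressions are our reading of the full-duplex and half-duplex rate formulas of this section, and the function names are hypothetical.

```python
import numpy as np

def rate_fd(rho_r, rho_d, eta_r, kappa, beta, N, M):
    """Approximate full-duplex rate (bpcu) under the diagonal-channel model."""
    R = min(M, N)
    sinr_sr = rho_r / (R / M + (kappa + beta) * (rho_r + eta_r))
    sinr_rd = rho_d / (R / M + (kappa + beta) * rho_d)
    return R * np.log2(1 + min(sinr_sr, sinr_rd))

def rate_hd(rho_r, rho_d, kappa, beta, N, M):
    """Approximate half-duplex rate (bpcu) with equal time-sharing."""
    R = min(M, N)
    rho = min(rho_r, rho_d)  # the weaker link limits the rate
    return 0.5 * R * np.log2(1 + rho / (R / (2 * M) + (kappa + beta) * rho))

def rate_approx(rho_r, rho_d, eta_r, kappa, beta, N, M):
    """Approximate optimized rate: the better of full- and half-duplex signaling."""
    return max(rate_fd(rho_r, rho_d, eta_r, kappa, beta, N, M),
               rate_hd(rho_r, rho_d, kappa, beta, N, M))
```

For instance, with linear-scale values $\rho_r = 10^4$, $\rho_d = 5 \times 10^3$, and $\kappa = \beta = 10^{-4}$, the full-duplex branch wins at low INR while the half-duplex branch wins once $\eta_r$ becomes very large.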
+^|diag(K^r"(5r'W -Sr'W)^frr) +/?^"ddiag(5r'm-Sr'm)^frd|, (36) 



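$$\begin{aligned} I_\tau(Q) \approx \min\Big\{ &\sum_{l=1}^{2} \tau[l] \log\det\Big( I + \rho_r \tfrac{MN}{R} Q_s[l] \Big( I + (\kappa + \beta) \tfrac{MN}{R} \big( \rho_r\, \mathrm{diag}(Q_s[l]) + \eta_r\, \mathrm{diag}(Q_r[l]) \big) \Big)^{-1} \Big), \\ &\sum_{l=1}^{2} \tau[l] \log\det\Big( I + \rho_d \tfrac{MN}{R} Q_r[l] \Big( I + (\kappa + \beta) \tfrac{MN}{R} \rho_d\, \mathrm{diag}(Q_r[l]) \Big)^{-1} \Big) \Big\}. \end{aligned} \tag{40}$$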

From (41)-(43), it is straightforward to show that the
approximated system operates as follows.

1) Say $\rho_r / \rho_d \le 1$. Then full-duplex is used iff

(44)

For either half- or full-duplex, $\underline{I}_*$ is invariant
to $\rho_d$, i.e., the source-to-relay link is the limiting one.
2) Say $1 < \rho_r / \rho_d < 1 + (\kappa + \beta)\eta_r M / R$.
Full-duplex is used iff

(45)

3) Say $1 + (\kappa + \beta)\eta_r M / R \le \rho_r / \rho_d$,
or equivalently
$\eta_r \le \eta_{crit} \triangleq \big( \frac{\rho_r}{\rho_d} - 1 \big) \frac{R}{(\kappa + \beta) M}$.
Then full-duplex is always used, and $\underline{I}_*$ is
invariant to $\rho_r$ and $\eta_r$, i.e., the rate is limited by
the relay-to-destination link.
Figure 5 shows a contour plot of the proposed achievable-rate
approximation as a function of INR $\eta_r$ and SNR $\rho_r$,
for the case that $\rho_r / \rho_d = 2$. We shall see in
Section VI that our approximation of the covariance-optimized
achievable-rate is reasonably close to that found by solving
(24) using bisection/GP.

VI. Numerical Results 

In this section, we numerically investigate the behavior of
the end-to-end rates achievable for full-duplex MIMO relaying
under the proposed limited transmitter/receiver-DR and
channel-estimation-error models. Recall that, in Section III,
it was shown that, for a fixed set of transmit covariance
matrices $Q$ and time-sharing parameter $\tau$, the achievable
rate $I_\tau(Q)$ can be lower-bounded using
$\underline{I}_\tau(Q)$ from (22), and upper-bounded using the
perfect-CSI $\bar{I}_\tau(Q)$, where the bounds converge as
training $T \to \infty$. Then, in Section IV, a bisection/GP
scheme was proposed to maximize $\underline{I}_\tau(Q)$ subject
to the power-constraint $Q \in \mathcal{Q}_\tau$, which was
subsequently maximized over $\tau \in [0, 1]$.

We now study the average behavior of the bisection/GP-optimized
rate $\underline{I}_* = \max_\tau \max_{Q \in \mathcal{Q}_\tau} \underline{I}_\tau(Q)$ as a function of
SNRs $\rho_r$ and $\rho_d$; INRs $\eta_r$ and $\eta_d$; dynamic range parameters
$\kappa$ and $\beta$; numbers of antennas $N_s$, $N_r$, $M_r$, and $M_d$; and
training length $T$. We also investigate the role of interference
cancellation, the role of two distinct data periods, the role
of $\tau$-optimization, and the relation to optimized half-duplex
(OHD) signaling. In doing so, we find close agreement with
the achievable-rate approximation proposed in Section V and
illustrated in Fig. 5.

Fig. 5. Contour plot of the approximated achievable rate $I_*$ versus relay
SNR $\rho_r$ and INR $\eta_r$, for $N = 3$, $M = 4$, $\beta = \kappa = -40$ dB, and $\rho_r/\rho_d = 2$.
The horizontal dashed line shows the INR $\eta_{\mathrm{crit}}$, and the dark curve shows the
boundary between full- and half-duplex regimes described in (45).

For the numerical results below, the propagation channel
model from Section II-A and the limited transmitter/receiver-DR
models from Section II-C and Section II-D were employed,
pilot-aided channel estimation was implemented as
in Section III-A, and the power constraint (23) was applied,
implying the channel-estimation-error covariance and the
aggregate-noise covariance (13). Throughout, we used $N =
N_s = N_r$ transmit antennas, $M = M_r = M_d$ receive antennas,
the SNR ratio $\rho_r/\rho_d = 2$, the destination INR $\eta_d = 1$, training
duration $T = 50$ (as justified below), Armijo parameters
$\sigma = 0.01$ and $\nu = 0.2$, and GP stopping threshold $\epsilon = 0.01$.
For each channel realization, the time-sharing coefficient $\tau$
was optimized over the grid $\tau \in \{0.1, 0.2, 0.3, \dots, 0.9\}$, and
all results were averaged over 100 realizations unless specified
otherwise.
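
The grid search over $\tau$ and the averaging over realizations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code behind the figures: the rate function is a toy scalar half-duplex model (a fraction $\tau$ of the time on the source-to-relay link, the remainder on the relay-to-destination link) standing in for the log-det rates, and the names `toy_half_duplex_rate`, `optimize_time_share`, and `average_optimized_rate` are hypothetical.

```python
def toy_half_duplex_rate(tau, rate_sr, rate_rd):
    """Toy end-to-end rate (assumption, not the paper's log-det expression):
    a fraction tau of the time is spent on source->relay, the rest on
    relay->destination, and the weaker link limits the end-to-end rate."""
    return min(tau * rate_sr, (1.0 - tau) * rate_rd)


def optimize_time_share(rate_fn, grid=None):
    """Return (best_tau, best_rate) over the grid {0.1, 0.2, ..., 0.9}."""
    if grid is None:
        grid = [k / 10 for k in range(1, 10)]
    best_tau = max(grid, key=rate_fn)
    return best_tau, rate_fn(best_tau)


def average_optimized_rate(link_rates):
    """Average the tau-optimized rate over (rate_sr, rate_rd) realizations."""
    total = 0.0
    for rate_sr, rate_rd in link_rates:
        _, best = optimize_time_share(
            lambda t: toy_half_duplex_rate(t, rate_sr, rate_rd)
        )
        total += best
    return total / len(link_rates)


if __name__ == "__main__":
    tau_star, rate_star = optimize_time_share(
        lambda t: toy_half_duplex_rate(t, 1.0, 4.0)
    )
    print(tau_star, rate_star)
```

In this toy model the best grid point sits where the two link terms balance, which is why a fixed $\tau = 0.5$ can be suboptimal when the two links have unequal rates.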

Below, we denote the full scheme proposed in Section IV
by "TCO-2-IC," which indicates the use of interference
cancellation (IC) and transmit covariance optimization (TCO)
performed individually over the 2 data periods (i.e., $\mathcal{T}_{\mathrm{data}}[1]$
and $\mathcal{T}_{\mathrm{data}}[2]$). To test the impact of IC and of two data periods,
we also implemented the proposed scheme but without IC,
which we refer to as "TCO-2," as well as the proposed scheme
with only one data period (i.e., $Q_i[1] = Q_i[2]\ \forall i$), which we
refer to as "TCO-1-IC." To optimize half-duplex, we used GP
to maximize the sum-rate $\underline{I}_\tau(Q, \tfrac{1}{2})$ under the power constraint
(23) and the half-duplex constraint $Q_1[2] = 0 = Q_2[1]$;
$\tau$-optimization was performed as described above.

To mitigate GP's sensitivity to initialization, we tried two
initializations for each $\zeta$-weighted-sum-rate problem, OHD
and "naive" full-duplex (NFD), and the one yielding the maximum
min-rate was retained. OHD was calculated as explained
above, whereas NFD employed the non-zero OHD covariance
matrices $Q_1[1]$ and $Q_2[2]$ over both data periods (which is
indeed optimal when $\eta_r = 0 = \eta_d$). Note that both OHD and
NFD are invariant to $\zeta$, $\eta_r$, and $\eta_d$.

In Fig. 6, we investigate the role of channel-estimation
training length $T$ on the achievable-rate lower bound $\underline{I}(Q)$
of TCO-2-IC. There we see that the rate increases rapidly in
$T$ for small values of $T$, but quickly saturates for larger values
of $T$. This behavior can be understood from (13)-(14), which
suggest that channel estimation error will have a negligible
effect on the noise covariances $\hat{\Sigma}_r[l]$ and $\hat{\Sigma}_d[l]$ when $TN \gg 1$.
Figure 6 also shows the corresponding achievable-rate upper
bounds $\bar{I}(Q)$. These traces confirm that the nominal training
length $T = 50$ ensures $\underline{I}(Q) \approx I(Q) \approx \bar{I}(Q)$.

In Fig. 7, we examine achievable-rate performance versus
INR $\eta_r$ for the TCO-2-IC, TCO-1-IC, TCO-2, and OHD
schemes, using different dynamic range parameters $\beta = \kappa$.
For OHD, we see that rate is invariant to INR $\eta_r$, as expected.
For the proposed TCO-2-IC, we observe "full duplex"
performance for low-to-mid values of $\eta_r$ and a transition to
OHD performance at high values of $\eta_r$, just as predicted
by the approximation in Section V. In fact, the rates in
Fig. 7 are very close to the approximated values in Fig. 5.
To see the importance of two distinct data-communication
periods, we examine the TCO-1-IC trace, where we observe
TCO-2-IC-like performance at low-to-midrange values of $\eta_r$,
but performance that drops below OHD at high $\eta_r$. Essentially,
TCO-1-IC forces full-duplex signaling at high INR
$\eta_r$, where half-duplex signaling is optimal, while TCO-2-IC
facilitates the possibility of half-duplex signaling through the
use of two distinct data-communication periods, similar to the
MIMO-interference-channel scheme in [23]. The effect of
$\tau$-optimization can be seen by comparing the two OHD traces,
one which uses the fixed value $\tau = 0.5$ and the other which
uses the optimized value $\tau = \tau_*$. The separation between these
traces shows that $\tau$-optimization gives a small but noticeable
rate gain. Finally, by examining the TCO-2 trace, we conclude
that partial interference cancellation is very important for all
but extremely low or high values of INR $\eta_r$.

^ We note that both half-duplex and the proposed TCO-2-IC scheme could
potentially benefit from allowing the relay to change the partitioning of
antennas from transmission to reception across the data period $l \in \{1, 2\}$.
In half-duplex mode, for example, it would be advantageous for the relay
to use $(N_r[1], M_r[1]) = (0, 7)$ and $(N_r[2], M_r[2]) = (7, 0)$ as opposed to
$(N_r[l], M_r[l]) = (3, 4)\ \forall l$. We do not consider such antenna-swapping in this
work, however.

Fig. 6. Achievable-rate lower bound $\underline{I}_*$ for TCO-2-IC versus training
interval $T$, with curves for $\eta_r = 0$ dB, $40$ dB, and $100$ dB. Here, $N = 3$,
$M = 4$, $\beta = \kappa = -40$ dB, $\rho_r = 15$ dB, $\rho_r/\rho_d = 2$, and $\eta_d = 0$ dB.
Also shown, as dashed lines, are the corresponding upper bounds $\bar{I}_*$
for each value of $\eta_r$.
Fig. 7. Achievable-rate lower bound $\underline{I}_*$ for TCO-2-IC, TCO-2, TCO-1-IC,
and OHD versus INR $\eta_r$. Here, $N = 3$, $M = 4$, $\rho_r = 15$ dB, $\rho_r/\rho_d = 2$,
$\eta_d = 0$ dB, and $T = 50$. OHD is plotted for $\beta = \kappa = -40$ dB, but was
observed to give nearly identical rate for $\beta = \kappa = -80$ dB. Both
fixed-time-share ($\tau = 0.5$) and optimized-time-share ($\tau = \tau_*$) versions of OHD are
shown.

In Fig. 8, we examine the rate of the proposed TCO-2-IC
and OHD versus SNR $\rho_r$, using the dynamic range parameters
$\beta = \kappa = -40$ dB, $\eta_d = 0$ dB, and two fixed values of
INR $\eta_r$. All the behaviors in Fig. 8 are predicted by the rate
approximation described in Section V and illustrated in Fig. 5.
In particular, at the low INR of $\eta_r = 20$ dB, TCO-2-IC operates
in the full-duplex regime for all values of SNR $\rho_r$. Meanwhile,
at the high INR of $\eta_r = 60$ dB, TCO-2-IC operates in
half-duplex at low values of SNR $\rho_r$, but switches to full-duplex
after $\rho_r$ exceeds a threshold.



Fig. 8. Achievable-rate lower bound $\underline{I}_*$ for TCO-2-IC and OHD versus
SNR $\rho_r$ (in dB). Here, $\rho_r/\rho_d = 2$, $\eta_d = 0$ dB, $N = 3$, $M = 4$, $\beta = \kappa = -40$ dB,
and $T = 50$. OHD in this figure is optimized over $\tau$.

In Fig. 9, we plot the GP-optimized rate contours of the
proposed TCO-2-IC versus both SNR $\rho_r$ and INR $\eta_r$, for
comparison to the approximation in Fig. 5. The two plots
show a relatively good match, confirming the accuracy of
the approximation. The greatest discrepancy between the plots
occurs when $\eta_r \approx \rho_r$ and both $\eta_r$ and $\rho_r$ are large, which makes
sense because the approximation was derived using $\eta_r \ll \rho_r$
and $\eta_r \gg \rho_r$.



Finally, in Fig. 10, we explore the achievable rate of TCO-2-IC
and OHD versus the number of antennas, $N$ and $M$, for
fixed values of SNR $\rho_r = 15$ dB and $\rho_r/\rho_d = 2$, INR $\eta_r =
30$ dB and $\eta_d = 0$ dB, and DR parameters $\beta = \kappa = -40$ dB.
We recall, from Fig. 7, that these parameters correspond to
the interesting regime where TCO-2-IC performs between
half- and full-duplex. In Fig. 10, we see that the achievable
rate increases with both the number of receive antennas $M$
and the number of transmit antennas $N$,
as expected. More interesting is the achievable-rate behavior
when the total number of antennas per modem is fixed, e.g.,
at $N + M = 7$, as illustrated by the triangles in Fig. 10. The
figure indicates that the configurations $(N, M) = (3, 4)$ and
$(N, M) = (4, 3)$ are best, which (it can be shown) is consistent
with the approximation from Section V.

Fig. 9. Contour plot of the achievable-rate lower bound $\underline{I}_*$ for TCO-2-IC
versus INR $\eta_r$ and SNR $\rho_r$ (in dB), for $\rho_d = \rho_r/2$, $\eta_d = 0$ dB, $N = 3$, $M = 4$,
and $\beta = \kappa = -40$ dB. The dark curve (i.e., approximate full/half-duplex
boundary) and dashed line (i.e., critical INR $\eta_{\mathrm{crit}}$) are the same as in Fig. 5
and shown for reference. The results are averaged over 250 realizations.

Fig. 10. Achievable-rate lower bound $\underline{I}_*$ for TCO-2-IC and OHD versus
number of transmit antennas $N$ with various numbers of receive antennas $M$.
Here, $\rho_r = 15$ dB, $\rho_r/\rho_d = 2$, $\eta_r = 30$ dB, $\eta_d = 0$ dB, $\beta = \kappa = -40$ dB, and
$T = 50$. OHD shown in this figure is optimized over $\tau$.



VII. Conclusion 

We considered the problem of decode-and-forward-based 
full-duplex MIMO relaying between a source node and des- 
tination node. In our analysis, we considered limited trans- 
mitter/receiver dynamic range, imperfect CSI, background 
AWGN, and very high levels of self-interference. Using ex- 
plicit models for dynamic-range limitation and pilot-aided 
channel estimation error, we derived upper and lower bounds 
on the end-to-end achievable rate that tighten as the number 
of pilots increases. Furthermore, we proposed a transmission 
scheme based on maximizing the achievable-rate lower-bound. 
The latter requires the solution to a nonconvex optimization 
problem, for which we use bisection search and Gradient 
Projection, the latter of which implicitly performs water- 
filling. In addition, we derived an analytic approximation to 
the achievable rate that agrees closely with the results of the 
numerical optimization. Finally, we studied the achievable-rate 
numerically, as a function of signal-to-noise ratio, interference- 
to-noise ratio, transmitter/receiver dynamic range, number of 
antennas, and number of pilots. In future work, we plan to 
investigate the effect of practical coding/decoding schemes, 
channel time-variation, and bidirectional relaying. 
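Appendix A
Channel Estimation Details

In this appendix, we derive certain details of Section III-A.
Under limited transmitter-DR, the undistorted received space-time
signal is

$$U = \sqrt{\alpha}\, H (X + C) + N, \tag{46}$$

where the spatial correlation of the non-distorted pilot signal
$X$ equals $\frac{2}{N} I$, and hence the spatial correlation of the
transmitter distortion $C$ equals $\frac{2\kappa}{N} I$. (Here, the spatial
correlation of $X = [x(1), \dots, x(TN)]$ refers to $\mathrm{E}\{x(t) x(t)^H\}$.)
Conditioned on $H$, the spatial correlation of $U$ is then
$\Phi = \frac{2\alpha(1+\kappa)}{N} H H^H + I$, and hence
the $H$-conditional spatial correlation of the receiver distortion
$E$ equals

$$\beta \operatorname{diag}(\Phi) = \beta \Big( \frac{2\alpha(1+\kappa)}{N} \operatorname{diag}(H H^H) + I \Big). \tag{47}$$

Given (5), the distorted received signal $Y$ can be written as

$$Y = \sqrt{\alpha}\, H X + W, \tag{48}$$

where $W = \sqrt{\alpha}\, H C + N + E$ is aggregate complex Gaussian
noise that is temporally white with $H$-conditional spatial
correlation $\frac{2\alpha\kappa}{N} H H^H + I + \beta \big( \frac{2\alpha(1+\kappa)}{N} \operatorname{diag}(H H^H) + I \big)$.

Due to the fact that $\frac{1}{2T} X X^H = I$, the channel estimate (6)
takes the form

$$\sqrt{\alpha}\, \hat{H} = \frac{1}{2T} Y X^H = \sqrt{\alpha}\, H + \frac{1}{2T} W X^H, \tag{49}$$

where $\frac{1}{2T} W X^H$ is Gaussian channel estimation error. We now
analyze the $H$-conditional correlations among the elements of
the channel estimation error matrix. We begin by noticing

$$\mathrm{E}\Big\{ \big[\tfrac{1}{2T} W X^H\big]_{m,p} \big[\tfrac{1}{2T} W X^H\big]^*_{n,q} \,\Big|\, H \Big\}
= \mathrm{E}\Big\{ \sum_k \frac{[W]_{m,k} [X]^*_{p,k}}{2T} \sum_l \frac{[W]^*_{n,l} [X]_{q,l}}{2T} \,\Big|\, H \Big\} \tag{50}$$

$$= \sum_{k,l} \frac{[X]^*_{p,k} [X]_{q,l}}{4T^2}\, \mathrm{E}\big\{ [W]_{m,k} [W]^*_{n,l} \,\big|\, H \big\}. \tag{51}$$

To find $\mathrm{E}\{[W]_{m,k} [W]^*_{n,l} \mid H\}$, we recall that

$$\mathrm{E}\{[N]_{m,k} [N]^*_{n,l} \mid H\} = \delta_{m-n}\, \delta_{k-l} \tag{52}$$

$$\mathrm{E}\{[C]_{p,k} [C]^*_{q,l} \mid H\} = \frac{2\kappa}{N}\, \delta_{p-q}\, \delta_{k-l} \tag{53}$$

$$\mathrm{E}\{[E]_{m,k} [E]^*_{n,l} \mid H\} = \beta [\Phi]_{m,m}\, \delta_{m-n}\, \delta_{k-l}, \tag{54}$$

implying that

$$\mathrm{E}\{[W]_{m,k} [W]^*_{n,l} \mid H\} = \alpha \sum_{p,q} [H]_{m,p} [H]^*_{n,q}\, \mathrm{E}\{[C]_{p,k} [C]^*_{q,l} \mid H\}
+ \mathrm{E}\{[N]_{m,k} [N]^*_{n,l} \mid H\} + \mathrm{E}\{[E]_{m,k} [E]^*_{n,l} \mid H\}, \tag{55}$$

which implies that

$$\mathrm{E}\Big\{ \big[\tfrac{1}{2T} W X^H\big]_{m,p} \big[\tfrac{1}{2T} W X^H\big]^*_{n,q} \,\Big|\, H \Big\}
= \frac{1}{4T^2} \sum_k [X]^*_{p,k} [X]_{q,k} \Big( \frac{2\alpha\kappa}{N} [H H^H]_{m,n} + (1 + \beta [\Phi]_{m,m})\, \delta_{m-n} \Big) \tag{56}$$

$$= \frac{1}{2T} \Big( \frac{2\alpha\kappa}{N} [H H^H]_{m,n} + (1 + \beta [\Phi]_{m,m})\, \delta_{m-n} \Big) \delta_{p-q}, \tag{57}$$

where the latter expression follows from the fact that
$\sum_k [X]^*_{p,k} [X]_{q,k} = 2T \delta_{p-q}$, as implied by $\frac{1}{2T} X X^H = I$.
Equation (57) implies the estimation error is temporally white
with $H$-conditional spatial correlation

$$D = \frac{1}{2T} \Big( \frac{2\alpha\kappa}{N} H H^H + I + \beta \operatorname{diag}(\Phi) \Big) \tag{58}$$

$$= \frac{1}{2T} \Big( \frac{2\alpha\kappa}{N} H H^H + I + \beta \Big( \frac{2\alpha(1+\kappa)}{N} \operatorname{diag}(H H^H) + I \Big) \Big). \tag{59}$$

Our final claim is that the channel estimation error
$\frac{1}{2T} W X^H$ is statistically equivalent to $D^{1/2} \tilde{H}$, with $\tilde{H} \in
\mathbb{C}^{M \times N}$ constructed from i.i.d. $\mathcal{CN}(0, 1)$ entries. This can be
seen from the following:

$$\mathrm{E}\big\{ [D^{1/2} \tilde{H}]_{m,p} [D^{1/2} \tilde{H}]^*_{n,q} \big\}
= \mathrm{E}\Big\{ \sum_k [D^{1/2}]_{m,k} [\tilde{H}]_{k,p} \sum_l [D^{1/2}]^*_{n,l} [\tilde{H}]^*_{l,q} \Big\} \tag{60}$$

$$= \sum_{k,l} [D^{1/2}]_{m,k} [D^{1/2}]^*_{n,l}\, \mathrm{E}\{ [\tilde{H}]_{k,p} [\tilde{H}]^*_{l,q} \} \tag{61}$$

$$= \sum_k [D^{1/2}]_{m,k} [D^{1/2}]^*_{n,k}\, \delta_{p-q} \tag{62}$$

$$= [D]_{m,n}\, \delta_{p-q}, \tag{63}$$

where we used the fact that $\mathrm{E}\{[\tilde{H}]_{k,p} [\tilde{H}]^*_{l,q}\} = \delta_{k-l}\, \delta_{p-q}$.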
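
The statistical-equivalence claim above can be checked numerically. The following Monte Carlo sketch assumes NumPy, a fixed Hermitian positive-definite $D$ chosen only for illustration, and a Hermitian square root; the function names are hypothetical. It estimates the fourth-order array $\mathrm{E}\{[D^{1/2}\tilde{H}]_{m,p}[D^{1/2}\tilde{H}]^*_{n,q}\}$ so that it can be compared against $[D]_{m,n}\delta_{p-q}$.

```python
import numpy as np


def psd_sqrt(D):
    """Hermitian square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(D)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.conj().T


def sample_error(D_half, N, rng):
    """Draw D^{1/2} H~ where H~ (M x N) has i.i.d. CN(0, 1) entries."""
    M = D_half.shape[0]
    H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    return D_half @ H


def empirical_correlation(D, N, trials, seed=0):
    """Monte Carlo estimate of R[m, p, n, q] = E{[Err]_{m,p} conj([Err]_{n,q})}."""
    rng = np.random.default_rng(seed)
    D_half = psd_sqrt(D)
    M = D.shape[0]
    R = np.zeros((M, N, M, N), dtype=complex)
    for _ in range(trials):
        err = sample_error(D_half, N, rng)
        R += np.einsum("mp,nq->mpnq", err, err.conj())
    return R / trials


if __name__ == "__main__":
    D = np.array([[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]])  # illustrative value only
    R = empirical_correlation(D, N=2, trials=20000)
    predicted = np.einsum("mn,pq->mpnq", D, np.eye(2))  # [D]_{m,n} delta_{p-q}
    print(np.max(np.abs(R - predicted)))
```

The printed deviation should shrink roughly as the inverse square root of the number of trials, as expected for a sample average.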



Appendix B 
Interference Cancellation Details 

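In this appendix, we characterize the channel-estimate-conditioned
covariance of the aggregate interference $v_r$, whose
expression was given in (11).

Recalling that $\hat{D} = \mathrm{E}\{D \mid \hat{H}\}$, we first establish that
$\operatorname{Cov}\{D^{1/2} \tilde{H} x \mid \hat{H}\} = \hat{D} \operatorname{tr}(\operatorname{Cov}(x))$, which will be useful in
the sequel. To show this, we examine the $(m, n)$th element of
the covariance matrix:

$$[\operatorname{Cov}(D^{1/2} \tilde{H} x \mid \hat{H})]_{m,n}
= \mathrm{E}\big\{ [D^{1/2} \tilde{H} x]_m [D^{1/2} \tilde{H} x]^*_n \,\big|\, \hat{H} \big\} \tag{64}$$

$$= \sum_{p,q,r,t} \mathrm{E}\big\{ [D^{1/2}]_{m,p} [D^{1/2}]^*_{n,q} \,\big|\, \hat{H} \big\}
\times \mathrm{E}\{ [\tilde{H}]_{p,r} [\tilde{H}]^*_{q,t} \}\, \mathrm{E}\{ [x]_r [x]^*_t \} \tag{65}$$

$$= [\hat{D}]_{m,n} \operatorname{tr}(\operatorname{Cov}\{x\}). \tag{66}$$

Rewriting the previous equality in matrix form, we
get the desired result. As a corollary, we note that
$\mathrm{E}\{(D^{1/2} \tilde{H}) \operatorname{Cov}\{x\} (D^{1/2} \tilde{H})^H \mid \hat{H}\} = \hat{D} \operatorname{tr}(\operatorname{Cov}\{x\})$, which
will also be useful in the sequel.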
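
The index manipulation leading to the trace identity can be reproduced numerically without sampling. The sketch below assumes NumPy and a deterministic $D$ (so that $\hat{D} = D$), and it uses a Cholesky factor in place of $D^{1/2}$, which is valid here because only the product $D^{1/2}(D^{1/2})^H = D$ enters. The name `conditional_covariance` and the numeric values are illustrative.

```python
import numpy as np


def conditional_covariance(D_half, cov_x):
    """Evaluate sum_{p,q,r,t} [D_half]_{m,p} conj([D_half]_{n,q})
    * E{[H]_{p,r} conj([H]_{q,t})} * [cov_x]_{r,t}
    for H with i.i.d. CN(0, 1) entries and zero-mean x with covariance cov_x."""
    M = D_half.shape[1]
    N = cov_x.shape[0]
    # Second moment of H: E{[H]_{p,r} conj([H]_{q,t})} = delta_{p-q} delta_{r-t}.
    moment = np.einsum("pq,rt->prqt", np.eye(M), np.eye(N))
    return np.einsum("mp,nq,prqt,rt->mn", D_half, D_half.conj(), moment, cov_x)


if __name__ == "__main__":
    D = np.array([[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]])  # illustrative value only
    D_half = np.linalg.cholesky(D)  # any factor with D_half @ D_half^H = D works
    cov_x = np.array([[1.0, 0.3j, 0.0],
                      [-0.3j, 2.0, 0.1],
                      [0.0, 0.1, 0.5]])
    lhs = conditional_covariance(D_half, cov_x)
    rhs = D * np.trace(cov_x)
    print(np.allclose(lhs, rhs))
```

Only the diagonal of the covariance of $x$ survives the sum over $r$ and $t$, which is where the trace comes from.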

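Next we characterize the $(\hat{H}_{sr}, \hat{H}_{rr})$-conditional covariance
of the receiver distortion $e_r$. Recalling that
$\operatorname{Cov}\{e_r\} = \beta \operatorname{diag}(\Phi_r)$ where $\Phi_r = \operatorname{Cov}\{u_r\}$, we
have $\operatorname{Cov}\{e_r \mid \hat{H}_{sr}, \hat{H}_{rr}\} = \beta \operatorname{diag}(\hat{\Phi}_r)$ where $\hat{\Phi}_r =
\operatorname{Cov}\{u_r \mid \hat{H}_{sr}, \hat{H}_{rr}\}$. Then, given that $u_r = y_r - e_r$ with
$y_r$ from (10), and using the facts that $\operatorname{Cov}(x_s + c_s) =
Q_s + \kappa \operatorname{diag}(Q_s)$ and $\operatorname{Cov}(x_r + c_r) = Q_r + \kappa \operatorname{diag}(Q_r)$, we
get

$$\hat{\Phi}_r = \rho_r \hat{H}_{sr} (Q_s + \kappa \operatorname{diag}(Q_s)) \hat{H}_{sr}^H
+ \mathrm{E}\big\{ (D_{sr}^{1/2} \tilde{H}_{sr}) (Q_s + \kappa \operatorname{diag}(Q_s)) (D_{sr}^{1/2} \tilde{H}_{sr})^H \,\big|\, \hat{H}_{sr} \big\}$$
$$\quad + \eta_r \hat{H}_{rr} (Q_r + \kappa \operatorname{diag}(Q_r)) \hat{H}_{rr}^H
+ \mathrm{E}\big\{ (D_{rr}^{1/2} \tilde{H}_{rr}) (Q_r + \kappa \operatorname{diag}(Q_r)) (D_{rr}^{1/2} \tilde{H}_{rr})^H \,\big|\, \hat{H}_{rr} \big\} + I \tag{67}$$

$$= \rho_r \hat{H}_{sr} (Q_s + \kappa \operatorname{diag}(Q_s)) \hat{H}_{sr}^H + \hat{D}_{sr} \operatorname{tr}(Q_s + \kappa \operatorname{diag}(Q_s))$$
$$\quad + \eta_r \hat{H}_{rr} (Q_r + \kappa \operatorname{diag}(Q_r)) \hat{H}_{rr}^H + \hat{D}_{rr} \operatorname{tr}(Q_r + \kappa \operatorname{diag}(Q_r)) + I. \tag{68}$$

Then,

$$\hat{\Phi}_r = \rho_r \hat{H}_{sr} (Q_s + \kappa \operatorname{diag}(Q_s)) \hat{H}_{sr}^H + (1 + \kappa) \hat{D}_{sr} \operatorname{tr}(Q_s)$$
$$\quad + \eta_r \hat{H}_{rr} (Q_r + \kappa \operatorname{diag}(Q_r)) \hat{H}_{rr}^H + (1 + \kappa) \hat{D}_{rr} \operatorname{tr}(Q_r) + I \tag{69}$$

$$\approx \rho_r \hat{H}_{sr} Q_s \hat{H}_{sr}^H + \hat{D}_{sr} \operatorname{tr}(Q_s) + \eta_r \hat{H}_{rr} Q_r \hat{H}_{rr}^H + \hat{D}_{rr} \operatorname{tr}(Q_r) + I, \tag{70}$$

where, for the approximation, we assumed $\kappa \ll 1$. Thus,

$$\operatorname{Cov}\{e_r \mid \hat{H}_{sr}, \hat{H}_{rr}\}
\approx \beta \big( \rho_r \operatorname{diag}(\hat{H}_{sr} Q_s \hat{H}_{sr}^H) + \hat{D}_{sr} \operatorname{tr}(Q_s)
+ \eta_r \operatorname{diag}(\hat{H}_{rr} Q_r \hat{H}_{rr}^H) + \hat{D}_{rr} \operatorname{tr}(Q_r) + I \big). \tag{71}$$

Finally we are ready to characterize $\hat{\Sigma}_r$, the $(\hat{H}_{sr}, \hat{H}_{rr})$-conditional
covariance of $v_r$. From (11),

$$\hat{\Sigma}_r = \kappa \rho_r\, \mathrm{E}\{ H_{sr} \operatorname{diag}(Q_s) H_{sr}^H \mid \hat{H}_{sr} \} + \hat{D}_{sr} \operatorname{tr}(Q_s)
+ \kappa \eta_r\, \mathrm{E}\{ H_{rr} \operatorname{diag}(Q_r) H_{rr}^H \mid \hat{H}_{rr} \} + \hat{D}_{rr} \operatorname{tr}(Q_r)$$
$$\quad + I + \operatorname{Cov}\{e_r \mid \hat{H}_{sr}, \hat{H}_{rr}\} \tag{72}$$

$$= \kappa \rho_r \hat{H}_{sr} \operatorname{diag}(Q_s) \hat{H}_{sr}^H + I + \operatorname{Cov}\{e_r \mid \hat{H}_{sr}, \hat{H}_{rr}\}
+ \kappa\, \mathrm{E}\big\{ (D_{sr}^{1/2} \tilde{H}_{sr}) \operatorname{diag}(Q_s) (D_{sr}^{1/2} \tilde{H}_{sr})^H \,\big|\, \hat{H}_{sr} \big\}$$
$$\quad + \hat{D}_{sr} \operatorname{tr}(Q_s) + \hat{D}_{rr} \operatorname{tr}(Q_r) + \kappa \eta_r \hat{H}_{rr} \operatorname{diag}(Q_r) \hat{H}_{rr}^H
+ \kappa\, \mathrm{E}\big\{ (D_{rr}^{1/2} \tilde{H}_{rr}) \operatorname{diag}(Q_r) (D_{rr}^{1/2} \tilde{H}_{rr})^H \,\big|\, \hat{H}_{rr} \big\} \tag{73}$$

$$= \kappa \rho_r \hat{H}_{sr} \operatorname{diag}(Q_s) \hat{H}_{sr}^H + (1 + \kappa) \hat{D}_{sr} \operatorname{tr}(Q_s)
+ \kappa \eta_r \hat{H}_{rr} \operatorname{diag}(Q_r) \hat{H}_{rr}^H + (1 + \kappa) \hat{D}_{rr} \operatorname{tr}(Q_r)$$
$$\quad + I + \operatorname{Cov}\{e_r \mid \hat{H}_{sr}, \hat{H}_{rr}\} \tag{74}$$

$$\approx I + \kappa \rho_r \hat{H}_{sr} \operatorname{diag}(Q_s) \hat{H}_{sr}^H + \hat{D}_{sr} \operatorname{tr}(Q_s)
+ \kappa \eta_r \hat{H}_{rr} \operatorname{diag}(Q_r) \hat{H}_{rr}^H + \hat{D}_{rr} \operatorname{tr}(Q_r)$$
$$\quad + \beta \rho_r \operatorname{diag}(\hat{H}_{sr} Q_s \hat{H}_{sr}^H) + \beta \eta_r \operatorname{diag}(\hat{H}_{rr} Q_r \hat{H}_{rr}^H), \tag{75}$$

where, for the approximation, we assumed $\kappa \ll 1$ and $\beta \ll 1$,
and we leveraged (71).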



Appendix C 
Gradient Details 



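In this appendix, we derive an expression for the gradient
$\nabla_{Q_r[l]} I(Q, \zeta)$ by first deriving an expression for the derivative
$\partial I(Q, \zeta) / \partial Q_r[l]$ and then using the fact that
$\nabla_{Q_r[l]} I = 2 \big( \partial I / \partial Q_r[l] \big)^*$.

To do this, we first consider the related problem of computing
the derivative $\partial \det(Y) / \partial X$, where

$$Y = C \operatorname{diag}(X) D + \operatorname{diag}(E X F) + G \operatorname{tr}(X) + Z, \tag{76}$$

and where (76) can be written elementwise as

$$[Y]_{i,j} = \sum_{m,n} [C]_{i,m} [X]_{m,n} [D]_{n,j}\, \delta_{m-n}
+ \sum_{m,n} [E]_{i,m} [X]_{m,n} [F]_{n,j}\, \delta_{i-j}
+ [G]_{i,j} \sum_{m,n} [X]_{m,n}\, \delta_{m-n} + [Z]_{i,j}. \tag{77}$$

Notice that, for $V_{r,s}$ defined as a zero-valued matrix except
for a unity element at row $r$ and column $s$, we have

$$\frac{\partial \det(Y)}{\partial X} = \sum_{r,s} V_{r,s} \frac{\partial \det(Y)}{\partial [X]_{r,s}} \tag{78}$$

$$= \sum_{r,s} V_{r,s} \sum_{i,j} \frac{\partial \det(Y)}{\partial [Y]_{i,j}} \frac{\partial [Y]_{i,j}}{\partial [X]_{r,s}}. \tag{79}$$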

■/(Qs[l],Qs[2],Qr[l],Qr[2],C) 

logdet(i:r[Z]))} (83) 

-(1 - C) logdet (^Pd-ffrd diag(g,[;])fl'^d + /3pd diag(fl-rdQJ/]ir^d) + -^rd tiQM + Z2[l]) 
+Clogdet (/?77rdiag(^rrdQrW-H'rd) + KVrH„diag{Q,[l])H^, + b„tiQ,[l] + Z3[«]) 

-C logdet (/?77rdiag(irrdQrW-H'rd) + KVrH„dia.giQ,[l])H^, + b„trQ,[l] + Z4[/])| (84) 
= ^-^^^r[l]{{H'^,[S,'[l] +PdmgiS^'[l] ~ ±-\l])]H,,f + ^diag (-ff^dC-^d - ±^\l])H,,)] 
+ ^rWsum(Ad (^d - §^r[l]{^ dmg {H^^.iSi'il] - ^:;\l])Hn) 

+/3(^"ddiag(5r'm - t;\l])Hray} + :^t[1] sum {b„ Q {S;^ [I] - t;\l]y)l. (85) 



d 



d 



]T[^]{(l-C)(logdct(5dm)-logdct(Sd[/]))+C(logdct(5r[Z]) 
-t[1][{1 - C) logdet (pd-ffrdQrW^S + ^^Pd^rddiag(Q,H)£r,'; 



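Then, using (77), we get

$$\frac{\partial \det(Y)}{\partial X}
= \sum_{r,s} V_{r,s} \sum_{i,j} \frac{\partial \det(Y)}{\partial [Y]_{i,j}}
\big( [C]_{i,r} [D]_{s,j}\, \delta_{r-s} + [E]_{i,r} [F]_{s,j}\, \delta_{i-j} + [G]_{i,j}\, \delta_{r-s} \big) \tag{80}$$

$$= \operatorname{diag}\Big( D \Big( \frac{\partial \det Y}{\partial Y} \Big)^T C \Big)
+ \Big( F \operatorname{diag}\Big( \frac{\partial \det Y}{\partial Y} \Big) E \Big)^T
+ \operatorname{sum}\Big( G \odot \frac{\partial \det Y}{\partial Y} \Big) I \tag{81}$$

$$= \det(Y) \Big( \operatorname{diag}(D Y^{-1} C) + (F \operatorname{diag}(Y^{-1}) E)^T
+ \operatorname{sum}(G \odot (Y^{-1})^T)\, I \Big), \tag{82}$$

where, for the last step, we used the fact that $\frac{\partial \det(Y)}{\partial Y} =
\det(Y) (Y^{-1})^T$.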
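
As a sanity check on the closed-form derivative of $\det(Y)$ with respect to $X$, one can compare it against central finite differences. The sketch below is restricted to real-valued matrices (so that every entry of $X$ is an independent real variable), assumes NumPy, and reads $\operatorname{diag}(\cdot)$ as "zero the off-diagonal entries" and $\operatorname{sum}(\cdot)$ as the sum of all entries; the function names and test dimensions are illustrative assumptions.

```python
import numpy as np


def dg(A):
    """Keep only the diagonal of a square matrix (the diag(.) operator)."""
    return np.diag(np.diag(A))


def build_Y(X, C, D, E, F, G, Z):
    """Y = C diag(X) D + diag(E X F) + G tr(X) + Z."""
    return C @ dg(X) @ D + dg(E @ X @ F) + G * np.trace(X) + Z


def ddet_dX(X, C, D, E, F, G, Z):
    """Closed-form derivative of det(Y) with respect to X (real case)."""
    Y = build_Y(X, C, D, E, F, G, Z)
    Y_inv = np.linalg.inv(Y)
    n = X.shape[0]
    return np.linalg.det(Y) * (
        dg(D @ Y_inv @ C)
        + (F @ dg(Y_inv) @ E).T
        + np.sum(G * Y_inv.T) * np.eye(n)
    )


def ddet_dX_numeric(X, C, D, E, F, G, Z, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    grad = np.zeros_like(X)
    for r in range(X.shape[0]):
        for s in range(X.shape[1]):
            P = np.zeros_like(X)
            P[r, s] = h
            plus = np.linalg.det(build_Y(X + P, C, D, E, F, G, Z))
            minus = np.linalg.det(build_Y(X - P, C, D, E, F, G, Z))
            grad[r, s] = (plus - minus) / (2 * h)
    return grad


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m, n = 3, 2  # Y is m x m, X is n x n (illustrative sizes)
    C, E = rng.standard_normal((m, n)), rng.standard_normal((m, n))
    D, F = rng.standard_normal((n, m)), rng.standard_normal((n, m))
    G, Z = rng.standard_normal((m, m)), 5.0 * np.eye(m)
    X = rng.standard_normal((n, n))
    args = (X, C, D, E, F, G, Z)
    print(np.max(np.abs(ddet_dX(*args) - ddet_dX_numeric(*args))))
```

The complex-valued case used for the gradient of the rate additionally relies on the conjugate-derivative convention stated at the start of this appendix, which this real-valued check does not exercise.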

Applying (82) to (22), we can obtain an expression for
$\partial I / \partial Q_r[l]$. To do so, we think of $Z$ in (76) as representing the
terms in $I$ that have zero derivative with respect to $Q_r[l]$.
Using $S_d[l]$ and $S_r[l]$ defined in (37)-(38), and recalling the
expression for $\hat{\Sigma}_d[l]$, the result is given in (85).

Finally, using $G_r[l] = 2 \big( \partial I / \partial Q_r[l] \big)^*$, and leveraging the fact
that $S_d[l]$, $S_r[l]$, $\hat{\Sigma}_d[l]$, and $\hat{\Sigma}_r[l]$ are Hermitian matrices, we
get the expression for $G_r[l]$ in (36). A similar expression
results for $G_s[l]$.



References 

[1] M. Jain, J. I. Choi, T. M. Kim, D. Bharadia, S. Seth, K. Srinivasan,
P. Levis, S. Katti, and P. Sinha, "Practical, real-time, full duplex
wireless," in Proc. ACM Internat. Conf. Mobile Comput. & Netw., (Las
Vegas, NV), pp. 301-312, Sept. 2011.

[2] E. Everett, M. Duarte, C. Dick, and A. Sabharwal, "Empowering full-
duplex wireless communication by exploiting directional diversity," in
Proc. Asilomar Conf. Signals Syst. Comput., pp. 2002-2006, Nov. 2011.

[3] Y. Hua, "An overview of beamforming and power allocation for MIMO
relays," in Proc. IEEE Military Commun. Conf., (San Jose, CA),
pp. 375-380, Nov. 2010.

[4] B. Wang, J. Zhang, and A. Høst-Madsen, "On the capacity of MIMO
relay channels," IEEE Trans. Inform. Theory, vol. 51, pp. 29-43, Jan.
2005.

[5] S. Simoens, O. Muñoz-Medina, J. Vidal, and A. del Coso, "On the
Gaussian MIMO relay channel with full channel state information,"
IEEE Trans. Signal Process., vol. 57, pp. 3588-3599, Sep. 2009.

[6] D. W. Bliss, P. A. Parker, and A. R. Margetts, "Simultaneous transmis-
sion and reception for improved wireless network performance," in Proc.
IEEE Workshop Statist. Signal Process., (Madison, WI), pp. 478-482,
Aug. 2007.

[7] P. Larsson and M. Prytz, "MIMO on-frequency repeater with self- 
interference cancellation and mitigation," in Proc. IEEE Veh. Tech. Conf, 
(Barcelona, Spain), pp. 1-5, Apr 2009. 
[8] T. Riihonen, S. Werner, and R. Wichman, "Spatial loop interference 
suppression in full-duplex MIMO relays," in Proc. Asilomar Conf 
Signals Syst. Comput., (Pacific Grove, CA), pp. 1508-1512, Nov. 2009. 
[9] T. Riihonen, S. Werner, and R. Wichman, "Residual self-interference in 
full-duplex MIMO relays after null-space projection and cancellation," 
in Proc. Asilomar Conf. Signals Syst. Comput., (Pacific Grove, CA), 
pp. 653-657, Nov. 2010. 

[10] T. Riihonen, A. Balakrishnan, K. Haneda, S. Wyne, S. Werner, 
and R. Wichman, "Optimal eigenbeamforming for suppressing self- 
interference in full-duplex MIMO relays," in Proc. Conf. Inform. Science 
& Syst. (Baltimore, MD), pp. 1-5, Mar. 2011. 

[11] S. Sohaib and Daniel K.C. So, "Asynchronous polarized cooperative 
MIMO communication," in Proc. IEEE Veh. Tech. Conf, (Barcelona, 
Spain), pp. 1-5, Apr. 2009. 

[12] J. Sangiamwong, T. Asai, J. Hagiwara, Y. Okumura, and T. Ohya, "Joint 
multi-filter design for full-duplex MU-MIMO relaying," in Proc. IEEE 
Veh. Tech. Conf, (Barcelona, Spain), pp. 1-5, Apr 2009. 

[13] B. Chun, E.-R. Jeong, J. Joung, Y. Oh, and Y. H. Lee, "Pre-nulling
for self-interference suppression in full-duplex relays," in Proc. APSIPA
Annual Summit and Conf., (Sapporo, Japan), pp. 91-67, Oct. 2009.

[14] B. Chun and Y. H. Lee, "A spatial self-interference nullification method 
for full duplex amplify-and-forward MIMO relays," in Proc. IEEE 
Wireless Commun. & Netw. Conf, (Sydney, Australia), pp. 1-6, Apr 
2010. 

[15] P. Lioliou, M. Viberg, M. Coldrey, and F. Athley, "Self-interference 
suppression in full-duplex MIMO relays," in Proc. Asilomar Conf. 
Signals Syst Comput, (Pacific Grove, CA), pp. 658-662, Oct. 2010. 

[16] G. Santella and F. Mazzenga, "A hybrid analytical-simulation procedure
for performance evaluation in M-QAM-OFDM schemes in presence of
nonlinear distortions," IEEE Trans. Veh. Tech., vol. 47, pp. 142-151,
Feb. 1998.

[17] H. Suzuki, T. V. A. Tran, I. B. Collings, G. Daniels, and M. Hedley, 
"Transmitter noise effect on the performance of a MIMO-OFDM hard- 
ware implementation achieving improved coverage," IEEE J. Sel. Areas 
Commun., vol. 26, pp. 867-876, Aug. 2008. 

[18] R. M. Gray and T. G. Stockham, Jr., "Dithered quantizers," IEEE Trans.
Inform. Theory, vol. 39, pp. 805-812, May 1993.

[19] W. Namgoong, "Modeling and analysis of nonlinearities and mismatches
in AC-coupled direct-conversion receiver," IEEE Trans. Wireless Com-
mun., vol. 4, pp. 163-173, Jan. 2005.

[20] B. Hassibi and B. M. Hochwald, "How much training is needed in 
multiple-antenna wireless links," IEEE Trans. Inform. Theory, vol. 49, 
pp. 951-963, Apr. 2003. 

[21] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. 
New York: Cambridge University Press, 2005. 

[22] D. Bertsekas, Nonlinear Programming. Athena Scientific, 2nd ed., 1999. 

[23] Y. Rong and Y. Hua, "Optimal power schedule for distributed MIMO
links," IEEE Trans. Wireless Commun., vol. 7, pp. 2896-2900, Aug.
2008.



Brian P. Day received the B.S. in Electrical and 
Computer Engineering from The Ohio State Uni- 
versity in 2010. Since 2010, he has been working 
toward the Ph.D degree in Electrical and Computer 
Engineering at The Ohio State University. His pri- 
mary research interests are full-duplex communica- 
tion, signal processing, and optimization. 




Adam R. Margetts received a dual B.S. degree
in Electrical Engineering and Mathematics from
Utah State University, Logan, UT in 2000; and the
M.S. and Ph.D. degrees in Electrical Engineering
from The Ohio State University, Columbus, OH in
2002 and 2005, respectively. Dr. Margetts has been
with MIT Lincoln Laboratory, Lexington, MA since
2005 and holds two patents in the area of signal
processing for communications. His current research
interests include distributed transmit beamforming,
cooperative communications, full-duplex relay sys-
tems, space-time coding, and wireless networking.




Daniel W. Bliss is a senior member of the tech- 
nical staff at MIT Lincoln Laboratory in the Ad- 
vanced Sensor Techniques group. Since 1997 he 
has been employed by MIT Lincoln Laboratory, 
where he focuses on adaptive signal processing, 
parameter estimation bounds, and information the- 
oretic performance bounds for multisensor systems. 
His current research topics include multiple-input
multiple-output (MIMO) wireless communications, 
MIMO radar, cognitive radios, radio network per- 
formance bounds, geolocation techniques, channel 
phenomenology, and signal processing and machine learning for anticipatory 
medical monitoring. 

Dan received his Ph.D. and M.S. in Physics from the University of 
California at San Diego (1997 and 1995), and his BSEE in Electrical 
Engineering from Arizona State University (1989). Employed by General 
Dynamics (1989-1991), he designed avionics for the Atlas-Centaur launch 
vehicle, and performed research and development of fault-tolerant avionics. 
As a member of the superconducting magnet group at General Dynamics 
(1991-1993), he performed magnetic field calculations and optimization for 
high-energy particle-accelerator superconducting magnets. His doctoral work 
(1993-1997) was in the area of high-energy particle physics, searching for 
bound states of gluons, studying the two-photon production of hadronic 
final states, and investigating innovative techniques for lattice-gauge-theory 
calculations. 



Philip Schniter received the B.S. and M.S. degrees 
in Electrical and Computer Engineering from the 
University of Illinois at Urbana-Champaign in 1992 
and 1993, respectively. From 1993 to 1996 he was 
employed by Tektronix Inc. in Beaverton, OR as 
a systems engineer, and in 2000, he received the 
Ph.D. degree in Electrical Engineering from Cornell 
University in Ithaca, NY. Subsequently, he joined the 
Department of Electrical and Computer Engineering 
at The Ohio State University in Columbus, OH, 
where he is now an Associate Professor and a mem- 
ber of the Information Processing Systems (IPS) Lab. In 2003, he received 
the National Science Foundation CAREER Award, and in 2008-2009 he was 
a visiting professor at Eurecom (Sophia Antipolis, France) and Supelec (Gif- 
sur-Yvette, France). Dr. Schniter's areas of interest include statistical signal
processing, wireless communications and networks, and machine learning. 




